* Why is max_cstate=1 still needed?
@ 2011-03-11 12:18 Jiri Slaby
  2011-03-11 14:02 ` Thomas Gleixner
  2011-03-11 14:02 ` Thomas Gleixner
  0 siblings, 2 replies; 4+ messages in thread
From: Jiri Slaby @ 2011-03-11 12:18 UTC (permalink / raw)
  To: Len Brown
  Cc: linux-pm, x86@kernel.org, Linux kernel mailing list,
	Thomas Gleixner, Jiri Slaby

Hello,

there are still reports against the latest kernels that people need to
pass processor/intel_idle.max_cstate=1 to successfully boot the kernel.
The symptoms are always the same: until the parameter is specified or
the user presses a key, the system won't boot up.
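
When triaging such a report it helps to know whether the reporter's
kernel command line already carries one of these limits. The following
is only a minimal sketch: it assumes that "processor/intel_idle.max_cstate"
above is shorthand for the two separate options processor.max_cstate and
intel_idle.max_cstate, and the function name max_cstate_limits is
hypothetical, not something from the kernel tree.

```python
import re

# Assumption: the shorthand in the mail stands for these two options.
_MAX_CSTATE_RE = re.compile(r"(processor|intel_idle)\.max_cstate=(\d+)")


def max_cstate_limits(cmdline: str) -> dict:
    """Map driver name -> max_cstate limit found on a kernel command line."""
    limits = {}
    for token in cmdline.split():
        match = _MAX_CSTATE_RE.fullmatch(token)
        if match:
            # A later occurrence of the same option overwrites an earlier one.
            limits[match.group(1)] = int(match.group(2))
    return limits


# Illustrative command line; on a live system the string could be read
# from /proc/cmdline instead.
sample = "root=/dev/sda1 ro quiet intel_idle.max_cstate=1 processor.max_cstate=1"
print(max_cstate_limits(sample))
```

An empty result means neither limit was passed on the command line.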

This started to appear between 2.6.31 and 2.6.34 (possibly a 2.6.33
regression) and continues to be reported against the latest stable
2.6.37.3. For example:
https://bugzilla.kernel.org/show_bug.cgi?id=15289
https://bugzilla.novell.com/show_bug.cgi?id=579932
https://bugzilla.novell.com/show_bug.cgi?id=673589
https://bugzilla.novell.com/show_bug.cgi?id=675161

I see that there were some fixes in .38-rc in this bug report (they look
unrelated):
https://bugzilla.kernel.org/show_bug.cgi?id=29992

Should they give .38-rc a try?

Any help would be appreciated.

thanks,
-- 
js
suse labs

* Re: Why is max_cstate=1 still needed?
  2011-03-11 12:18 Why is max_cstate=1 still needed? Jiri Slaby
  2011-03-11 14:02 ` Thomas Gleixner
@ 2011-03-11 14:02 ` Thomas Gleixner
  1 sibling, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2011-03-11 14:02 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Jiri Slaby, linux-pm, x86@kernel.org, Linux kernel mailing list

On Fri, 11 Mar 2011, Jiri Slaby wrote:
> there are still reports against the latest kernels, that people need to
> pass processor/intel_idle.max_cstate=1 to successfully boot the kernel.
> The symptoms are always the same, until the parameter is specified OR
> until the user presses a key, the system won't boot up.
> 
> This started to appear between 2.6.31 and 2.6.34 (possibly a 2.6.33
> regression) and continues to be reported against the latest stable
> 2.6.37.3. For example:
> https://bugzilla.kernel.org/show_bug.cgi?id=15289
> https://bugzilla.novell.com/show_bug.cgi?id=579932
> https://bugzilla.novell.com/show_bug.cgi?id=673589
> https://bugzilla.novell.com/show_bug.cgi?id=675161
> 
> I see that there were some fixes in .38-rc in this bug report (they look
> unrelated):
> https://bugzilla.kernel.org/show_bug.cgi?id=29992
> 
> Should they give .38-rc a try?

Trying does no damage :(
 
> Any help would be appreciated.

I went through the bug reports briefly. While they all report the same
symptoms (hangs until key pressed) the root cause varies.

   - SMM C1E handler broken (affects AMD only)
   - HPET issues (mostly AMD)
   - The usual ACPI/BIOS madness

On most of those systems nohz=off hides the problem as well, because it
prevents deeper power states: the local APIC timer just keeps ticking
and the broadcast via PIT/HPET is not activated. hpet=disable is
another way to work around it.
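
For sorting bug reports by workaround, a small helper can tell which of
the two timer-related options named above a given boot used. This is a
rough sketch, not an existing tool: the function name
active_timer_workarounds is hypothetical, and it only looks for the
exact spellings nohz=off and hpet=disable.

```python
# The two timer-related workarounds mentioned in this mail.
TIMER_WORKAROUNDS = ("nohz=off", "hpet=disable")


def active_timer_workarounds(cmdline: str) -> list:
    """Return the timer workarounds present on a kernel command line."""
    tokens = set(cmdline.split())
    # Keep the order of TIMER_WORKAROUNDS so the output is stable.
    return [opt for opt in TIMER_WORKAROUNDS if opt in tokens]


# Illustrative command line; /proc/cmdline would supply the real one.
print(active_timer_workarounds("root=/dev/sda1 ro nohz=off quiet"))
```

If a system boots with either option but hangs without it, that points
towards the timer broadcast path rather than the C-state itself.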

To be honest, we have no real handle on all of this, as much of the
wreckage is hidden deep in that black hole of ACPI/BIOS. We grew some
quirks and detection mechanisms over time, but there seems to be a
never-ending source of trouble, especially as HW vendors keep adding
more power-related features to the BIOS. We've seen perf wreckage as
well, as some of those features abuse performance counters :(

Of course those "features" are only tested against that other OS, some
of them even require a driver counterpart for the other OS. Of course
we have no information about that at all and the HW vendors are
helpful as ever.

Yes, it's sad, but reality.

Thanks,

	tglx
