public inbox for linux-kernel@vger.kernel.org
* rdmsr_safe_on_cpu hangs?
@ 2008-02-26 21:58 Dan Upton
  2008-02-27  6:51 ` H. Peter Anvin
  0 siblings, 1 reply; 2+ messages in thread
From: Dan Upton @ 2008-02-26 21:58 UTC (permalink / raw)
  To: linux-kernel

I'm seeing this behavior in both 2.6.23.14 and 2.6.24.3, on x86-64 on
a Core2 Duo.  Since I'm working on temperature-based scheduling, I've
duplicated the calls to rdmsr_on_cpu from hwmon/coretemp.c in a few
places in sched.c and sched_debug.c.  All of the instances in
sched_debug.c are of course only reached once the system has booted
all the way, and I haven't run into any problems reading (and getting
correct values) that way.  When I saw rdmsr_on_cpu hang, I switched to
using rdmsr_safe_on_cpu.  I thought that was supposed to fail
gracefully, but it still seems to hang.  I have two different
problems:

- In the 2.6.23.14 kernel, I was trying to read via a function called
  from sched_balance_self.  It seems to work fine until it becomes aware
  of the second core (i.e., rdmsr_safe_on_cpu(0, IA32_THERM_STATUS, &eax,
  &edx) works fine, but rdmsr_safe_on_cpu(1, ...) never returns).
- In the 2.6.24.3 kernel, it works fine when I call it from
  sched_balance_self.  I then added a second call to the function from
  prepare_task_switch, so I could save some relevant information before
  switching the task out, and it eventually hangs reading on core 0,
  obviously after "Booting the kernel" but before "Red Hat nash"
  starts.

I guess the question is, am I just misunderstanding the use of
rdmsr_safe_on_cpu, or is it an issue with that particular MSR (some of
the stuff I've read indicates that rdmsr_safe was really only
implemented as a prequel to the coretemp driver), or is it something
wrong with rdmsr_safe_on_cpu?

-dan
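
For readers following along, here is a minimal userspace sketch of how
the low word (eax) returned for IA32_THERM_STATUS could be turned into
a temperature.  The bit positions follow Intel's documented layout for
that MSR (reading-valid flag in bit 31, digital readout in bits 22:16).
The helper name decode_therm_status and the tjmax_c parameter are
hypothetical and not taken from the message above; the readout counts
degrees below TjMax, so the caller has to supply a TjMax value that is
right for the part in question.

```c
#include <stdint.h>

/* Bit layout of IA32_THERM_STATUS (MSR 0x19C), low 32 bits. */
#define THERM_STATUS_READING_VALID (1u << 31)
#define THERM_STATUS_READOUT_SHIFT 16
#define THERM_STATUS_READOUT_MASK  0x7fu

/*
 * Hypothetical helper: convert the eax word of IA32_THERM_STATUS into
 * degrees Celsius.  tjmax_c is an assumption supplied by the caller.
 * Returns 0 on success, -1 if the reading-valid bit is clear.
 */
static int decode_therm_status(uint32_t eax, int tjmax_c, int *temp_c)
{
    uint32_t below_tjmax;

    if (!(eax & THERM_STATUS_READING_VALID))
        return -1;

    /* The digital readout is the number of degrees below TjMax. */
    below_tjmax = (eax >> THERM_STATUS_READOUT_SHIFT) & THERM_STATUS_READOUT_MASK;
    *temp_c = tjmax_c - (int)below_tjmax;
    return 0;
}
```

With an assumed TjMax of 100, an eax value of 0x80100000 (valid bit
set, readout 16) decodes to 84 degrees C.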


* Re: rdmsr_safe_on_cpu hangs?
  2008-02-26 21:58 rdmsr_safe_on_cpu hangs? Dan Upton
@ 2008-02-27  6:51 ` H. Peter Anvin
  0 siblings, 0 replies; 2+ messages in thread
From: H. Peter Anvin @ 2008-02-27  6:51 UTC (permalink / raw)
  To: Dan Upton; +Cc: linux-kernel

Dan Upton wrote:
> 
> I guess the question is, am I just misunderstanding the use of
> rdmsr_safe_on_cpu, or is it an issue with that particular MSR (some of
> the stuff I've read indicates that rdmsr_safe was really only
> implemented as a prequel to the coretemp driver), or is it something
> wrong with rdmsr_safe_on_cpu?
> 

*msr_safe_*() can only trap #GPs, which would manifest as oopses if
you used the unsafe versions.  If the CPU *hangs* when touching an MSR,
nothing we can do in software will help.

	-hpa
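
The distinction drawn above can be illustrated with a small userspace
mock.  This is only a sketch: mock_rdmsr_safe is a hypothetical
stand-in, not the kernel function, and the -EIO value and the canned
register contents are arbitrary assumptions.  It models the one thing
the safe variants add, namely that a faulting access comes back to the
caller as a nonzero error code.  A read that never completes has no
return value to check, so no amount of error handling at the call site
covers that case.

```c
#include <errno.h>
#include <stdint.h>

#define MSR_IA32_THERM_STATUS 0x19cu

/*
 * Hypothetical stand-in for a fault-trapping MSR read.  In this mock,
 * any MSR other than IA32_THERM_STATUS "faults" and is reported as
 * -EIO (an arbitrary choice) instead of crashing.
 */
static int mock_rdmsr_safe(uint32_t msr, uint32_t *lo, uint32_t *hi)
{
    if (msr != MSR_IA32_THERM_STATUS)
        return -EIO;        /* models a trapped #GP */

    *lo = 0x80100000u;      /* canned value, for illustration only */
    *hi = 0;
    return 0;
}

/*
 * Caller-side pattern: the error code is only available if the read
 * returns at all.  Outputs are left untouched on failure.
 */
static int try_read_msr(uint32_t msr, uint32_t *lo, uint32_t *hi)
{
    int err = mock_rdmsr_safe(msr, lo, hi);

    if (err)
        return err;         /* fault was trapped and reported */
    return 0;
}
```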

