* [lm-sensors] Tyan S2877 CPU Temp Erratic
@ 2006-11-12 23:47 Paul Reilly
  2006-11-24 13:07 ` Rudolf Marek
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Paul Reilly @ 2006-11-12 23:47 UTC (permalink / raw)
  To: lm-sensors


Hello,
I have a Tyan S2877 motherboard. lm-sensors was working perfectly,
but every now and then the CPU temp field goes wrong. For instance,
it was reporting around 50°C for ages; then, after a reboot, it now
reports -199.75°C. I opened the box and checked that all is OK. The
BIOS reports the correct temp (around 48°C). I have restarted multiple
times and still can't get it to report the correct CPU temp again.

I am using the LM85 module/code, forced to use an ADT7463 chip,
as this best matches my hardware. My sensors.conf and the output of
sensors are shown below. NOTE: I only have one CPU in this dual
socket board.

CPU1 Temp:-199.75°C  (low  =   -71°C, high =   -73°C)

Has anyone seen this problem before, or does anyone know a fix?
The CPU is a dual-core Opteron 265.
Thanks in advance for any help.
Paul

# sensors
w83627hf-isa-0c00
Adapter: ISA adapter
CPU2 Volt: +0.14 V  (min =  +0.00 V, max =  +4.08 V)
+3.3V:     +3.44 V  (min =  +2.82 V, max =  +3.79 V)
CPU1 DIMM Volt:
           +2.62 V  (min =  +1.55 V, max =  +3.58 V)
Chassis Fan 2:
          8881 RPM  (min = 7258 RPM, div = 2)
Chassis Fan 1:
          3792 RPM  (min = 4090 RPM, div = 2)              ALARM
alarms:

adt7463-i2c-1-2e
Adapter: SMBus nForce2 adapter at a040
CPU2 DIMM Volt:
           +0.000 V  (min =  +2.47 V, max =  +2.73 V)   ALARM
CPU1 Volt: +1.383 V  (min =  +1.28 V, max =  +1.42 V)
+5V:      +5.007 V  (min =  +4.74 V, max =  +5.26 V)
+12V:     +11.938 V  (min = +11.38 V, max = +12.62 V)
CPU1 Fan:  5960 RPM  (min =    0 RPM)
CPU2 Fan:     0 RPM  (min =    0 RPM)
fan3:      9152 RPM  (min =    0 RPM)
CPU2 Temp:-71.25°C  (low  =  -200°C, high =   -73°C)     ALARM FAULT
CPU1 Temp:-199.75°C  (low  =   -71°C, high =   -73°C)
vid:      +1.550 V  (VRM Version 2.4)

------------------------------------------------------------
#######
#    Sensors configuration file used by 'libsensors' for Tyan S2877
#
#  To support the NFORCE4 SMBus controller, version 2.9.1 or later is required
#  If using Linux kernel 2.6.x, version 2.6.12 or later is required
#
#  To your /etc/modules.conf file, add the lines:
#     alias char-major-89 i2c-dev
#
#  To your /etc/rc.xxx files, add the lines:
#     modprobe i2c-nforce2
#     modprobe i2c-isa
#     modprobe lm85 force_adm1027=1,0x2e
#     modprobe w83627hf
#     sensors -s
#
#  Then copy this file to /etc/sensors.conf
#
#  Notes:
#
# Edited by: Raphael Deng <raphaeld at tyan.com> 04.10.06
# Changed @08/07/2006
# Add "force_adm1027" parameter @ 08/08/2006
#######


#chip "adm1027-*"
chip "adt7463-*"

    ignore pwm1
    ignore pwm2
    ignore pwm3
    ignore fan4
    ignore in2
    ignore in5
    ignore in6
    ignore in7
    ignore temp2

    label in0 "CPU2 DIMM Volt"
    label in1 "CPU1 Volt"
    label in3 "+5V"
    label in4 "+12V"

    label fan1 "CPU1 Fan"
    label fan2 "CPU2 Fan"
    label fan4 "Chassis Fan 0"

    label temp1 "CPU2 Temp"
    label temp3 "CPU1 Temp"

    set in0_min  2.6 * 0.95
    set in0_max  2.6 * 1.05
    set in1_min  1.35 * 0.95
    set in1_max  1.35 * 1.05
    set in3_min  5.0 * 0.95
    set in3_max  5.0 * 1.05
    set in4_min  12.0 * 0.95
    set in4_max  12.0 * 1.05

    compute temp1 @-72,  @+72
    compute temp3 @-72,  @+72

chip "w83627hf-*"

        ignore pwm1
        ignore pwm2
        ignore pwm3
        ignore in1
        ignore in3
        ignore in4
        ignore in6
        ignore in7
        ignore in8
        ignore fan3
        ignore temp1
        ignore temp2
        ignore temp3
        ignore vid
        ignore alarm
        ignore beep_enable

        label in0 "CPU2 Volt"
        label in2 "+3.3V"
        label in5 "CPU1 DIMM Volt"

        label fan2 "Chassis Fan 1"
        label fan1 "Chassis Fan 2"

# current lm-sensors has some issues supporting these 2 sensor types
#       label temp2 "CPU1 VRM Temp"
#       label temp1 "CPU2 VRM Temp"

#       set sensor1 3435
#       set sensor2 3435



* [lm-sensors] Tyan S2877 CPU Temp Erratic
  2006-11-12 23:47 [lm-sensors] Tyan S2877 CPU Temp Erratic Paul Reilly
@ 2006-11-24 13:07 ` Rudolf Marek
  2006-11-25 10:02 ` Paul Reilly
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Rudolf Marek @ 2006-11-24 13:07 UTC (permalink / raw)
  To: lm-sensors

Paul Reilly wrote:
> Hello,
> I have a Tyan S2877 motherboard. I had lm-sensors working perfectly,
> but every now and then the CPU temp field, goes wrong. For instance
> it was reporting around 50°C for ages, then after a reboot, it now
> reports -199.75°C . Opening the box, and checking all is OK. BIOS
> reports correct temp (around 48°C). Restarting multiple times, and
> still can't get it to go back to report correct CPU temp.

Hmm, this may be an electrical problem or a problem with the bus driver. Could
you please switch on i2c bus debugging in the kernel? If it is a production
machine and you cannot experiment much with it, you may try the k8temp
driver, which reads the CPU temperature directly from the processor.

The driver is in 2.6.19.

1) you may patch your kernel with the patches made for 2.6.19
(http://lists.lm-sensors.org/pipermail/lm-sensors/2006-August/017426.html)

2) use standalone version from here: 
http://assembler.cz/download/amd_digital_temp.tar.gz

3) lm-sensors 2.10.1 has the userspace support.

> I am using the LM85 module/code forced to use a ADT7463 chip
> as this best matches my hardware. My sensors.conf and output of
> sensors is shown below. NOTE, I only have one CPU in this dual
> socket board.
> 
> CPU1 Temp:-199.75°C  (low  =   -71°C, high =   -73°C)
> 
> Any one seen this problem before or know a fix?
> The CPU is a dual-core Opteron 265.

Oh, one more idea. Please remove the "thermal" driver from the kernel (rmmod thermal)
or compile the kernel without thermal zone support. Also, please send us the file produced by

cat /proc/acpi/dsdt > /tmp/dsdt.bin

Thanks,
Regards
Rudolf



* [lm-sensors] Tyan S2877 CPU Temp Erratic
  2006-11-12 23:47 [lm-sensors] Tyan S2877 CPU Temp Erratic Paul Reilly
  2006-11-24 13:07 ` Rudolf Marek
@ 2006-11-25 10:02 ` Paul Reilly
  2006-11-25 10:04 ` Rudolf Marek
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Paul Reilly @ 2006-11-25 10:02 UTC (permalink / raw)
  To: lm-sensors


Thanks Rudolf,

I now use K8-temp and it seems to work.

k8temp-pci-00c3
Adapter: PCI adapter
CPU#0 Core1: +54°C
CPU#0 Core2: +59°C

I am using a sensors.conf of:

chip "k8temp-*"
        label temp1 "CPU#0 Core1"
        label temp3 "CPU#0 Core2"
        label temp2 "CPU#1 Core1"
        label temp4 "CPU#1 Core2"

Is this correct?
I get no reading on temp2 or temp4, which is right as I only have
one dual core Opteron in this two socket board.

Thanks,
Paul



* [lm-sensors] Tyan S2877 CPU Temp Erratic
  2006-11-12 23:47 [lm-sensors] Tyan S2877 CPU Temp Erratic Paul Reilly
  2006-11-24 13:07 ` Rudolf Marek
  2006-11-25 10:02 ` Paul Reilly
@ 2006-11-25 10:04 ` Rudolf Marek
  2006-11-25 10:07 ` Rudolf Marek
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Rudolf Marek @ 2006-11-25 10:04 UTC (permalink / raw)
  To: lm-sensors

Hi,

Please CC to the list.

Paul Reilly wrote:
> Hi Rudolf,
> 
> Thanks for your help. The temp readings came back for a while,
> but are now gone again. Interestingly, they do change, from -199.5
> to -197.5 etc. So they seem to be affected, possibly, by the real
> reading from the CPU.

Or corrupted.

> 
>> cat /proc/acpi/dsdt > /tmp/dsdt.bin
> 
> http://astropaul.com/export/dsdt.bin

OK, I did not find anything suspicious here.


> I cannot make too many changes to this machine. It runs a static
> kernel (the LM85 module is also compiled statically). As suggested, I
> have turned off ACPI thermal zone and turned on i2c bus debugging.
> It will be rebooted tomorrow, and I will report back.

Good thanks.

> Will K8 Temp work with Opteron chip?

It will, but for some reason AMD has not documented this cool feature (for
core revisions before rev. F).

regards
Rudolf



* [lm-sensors] Tyan S2877 CPU Temp Erratic
  2006-11-12 23:47 [lm-sensors] Tyan S2877 CPU Temp Erratic Paul Reilly
                   ` (2 preceding siblings ...)
  2006-11-25 10:04 ` Rudolf Marek
@ 2006-11-25 10:07 ` Rudolf Marek
  2006-11-25 12:48 ` Rudolf Marek
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Rudolf Marek @ 2006-11-25 10:07 UTC (permalink / raw)
  To: lm-sensors

Hi,

> OK, I have patched my kernel 2.6.16.30  with support for K8 Temp
> and
> 
> [*] AMD K8 processor sensor
> [*] National Semiconductor LM85 and compatibles

Good.

> I will not be able to reboot as it is a production machine.
> We can reboot over the weekend only.

Ok.

> Do you have an example sensors.conf entry for the K8
> sensor to get the temperature values? It looks like there are
> 2 sensors per core, so I should be able to monitor 4 temps
> with my Opteron? Great!

If your Opteron supports 2 temps/core, then yes ;)

For an example, please check this page:
http://www.lm-sensors.org/browser/lm-sensors/trunk/etc/sensors.conf.eg

chip "k8temp-*"

    label temp1 "Core0 Temp"
    label temp2 "Core0 Temp"
    label temp3 "Core1 Temp"
    label temp4 "Core1 Temp"

Rudolf



* [lm-sensors] Tyan S2877 CPU Temp Erratic
  2006-11-12 23:47 [lm-sensors] Tyan S2877 CPU Temp Erratic Paul Reilly
                   ` (3 preceding siblings ...)
  2006-11-25 10:07 ` Rudolf Marek
@ 2006-11-25 12:48 ` Rudolf Marek
  2006-11-25 13:21 ` Paul Reilly
  2006-11-25 21:10 ` Rudolf Marek
  6 siblings, 0 replies; 8+ messages in thread
From: Rudolf Marek @ 2006-11-25 12:48 UTC (permalink / raw)
  To: lm-sensors

Hi,

> Thanks Rudolf,

Heh good ;)

> I now use K8-temp and it seems to work.
> 
> k8temp-pci-00c3
> Adapter: PCI adapter
> CPU#0 Core1: +54°C
> CPU#0 Core2: +59°C
> 
> I am using a sensors.conf of:
> 
> chip "k8temp-*"
>         label temp1 "CPU#0 Core1"
>         label temp3 "CPU#0 Core2"
>         label temp2 "CPU#1 Core1"
>         label temp4 "CPU#1 Core2"
> 
> Is this correct?
> I get no reading on temp2 or temp4, which is right as I only have
> one dual core Opteron in this two socket board.

Theoretically there could be 4 temps for one physical CPU: two sensors per core
* 2 cores = 4. Because your CPU is dual-core and supports only 1
temperature per core, you see two of them ;) I hope that clarifies it.

So all in all, this works fine. Let's check the i2c bus related problem. I will
wait for some logs from you ;)

Regards
Rudolf



* [lm-sensors] Tyan S2877 CPU Temp Erratic
  2006-11-12 23:47 [lm-sensors] Tyan S2877 CPU Temp Erratic Paul Reilly
                   ` (4 preceding siblings ...)
  2006-11-25 12:48 ` Rudolf Marek
@ 2006-11-25 13:21 ` Paul Reilly
  2006-11-25 21:10 ` Rudolf Marek
  6 siblings, 0 replies; 8+ messages in thread
From: Paul Reilly @ 2006-11-25 13:21 UTC (permalink / raw)
  To: lm-sensors


Hi Rudolf,

> Theoretically there could be 4 temps for one physical CPU. Two for each core
> * 2 = 4. And because your CPU is dualcore, and your CPU supports only 1
> temperature/core so you see two of them ;) I hope I clarified that.

Yes, I see from the k8temp.c
static SENSOR_DEVICE_ATTR_2(temp1_input, S_IRUGO, show_temp, NULL, 0, 0);
static SENSOR_DEVICE_ATTR_2(temp2_input, S_IRUGO, show_temp, NULL, 0, 1);
static SENSOR_DEVICE_ATTR_2(temp3_input, S_IRUGO, show_temp, NULL, 1, 0);
static SENSOR_DEVICE_ATTR_2(temp4_input, S_IRUGO, show_temp, NULL, 1, 1);

So,
temp1 = CPU#0 Core#0 Temp #0
temp2 = CPU#0 Core#0 Temp #1
temp3 = CPU#0 Core#1 Temp #0
temp4 = CPU#0 Core#1 Temp #1

My Dual Core Opteron (265) only reports temp1, and temp3.
That is OK. One temperature per core.
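
The mapping above can be written down as a small sketch. This is illustrative Python only: the helper name is hypothetical, and reading the last two macro arguments as (core, sensor) is the interpretation given above, not something checked against the driver documentation.

```python
def k8temp_channel(n):
    """Map a tempN index (1..4) to a (core, sensor) pair.

    Follows the order of the SENSOR_DEVICE_ATTR_2 lines quoted above:
    temp1 -> (0, 0), temp2 -> (0, 1), temp3 -> (1, 0), temp4 -> (1, 1).
    """
    if not 1 <= n <= 4:
        raise ValueError("k8temp exposes temp1..temp4 only")
    core, sensor = divmod(n - 1, 2)
    return core, sensor


for n in range(1, 5):
    core, sensor = k8temp_channel(n)
    print(f"temp{n} = CPU#0 Core#{core} Temp #{sensor}")
```

With one sensor per core, only the entries with sensor 0 (temp1 and temp3) would carry readings, which matches what I see.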

What would it look like if I add a second physical processor in
the second socket? Would I get the same lmsensor names repeated?

> So all in all, this works fine. Lets check the i2c bus related problem. And
> wait for some logs from you ;)

Here are the logs.
It looks like a timeout problem....

syslog:Nov 25 08:31:58 discover kernel: i2c /dev entries driver
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-9191: ISA main adapter registered
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: nForce2 SMBus adapter at 0xa000
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: nForce2 SMBus adapter at 0xa040
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: SMBus Timeout! (0x10)
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-9191: Driver w83627hf registered
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: SMBus Timeout! (0x10)
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-9191: Driver w83781d-isa registered
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: SMBus Timeout! (0x10)
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)
syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)



* [lm-sensors] Tyan S2877 CPU Temp Erratic
  2006-11-12 23:47 [lm-sensors] Tyan S2877 CPU Temp Erratic Paul Reilly
                   ` (5 preceding siblings ...)
  2006-11-25 13:21 ` Paul Reilly
@ 2006-11-25 21:10 ` Rudolf Marek
  6 siblings, 0 replies; 8+ messages in thread
From: Rudolf Marek @ 2006-11-25 21:10 UTC (permalink / raw)
  To: lm-sensors

Hi,

> What would it look like if I add a second physical processor in
> the second socket? Would I get the same lmsensor names repeated?

Yes, it will be just a new instance of the driver. So there will be another
k8temp-pci-00xx entry for the next physical CPU, where xx is something else
(in fact it is the PCI bus address).

> Here are the logs.
> It looks like a timeout problem....
> 
> syslog:Nov 25 08:31:58 discover kernel: i2c /dev entries driver
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-9191: ISA main adapter
> registered
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: nForce2 SMBus adapter
> at 0xa000
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: nForce2 SMBus adapter
> at 0xa040
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: SMBus Timeout! (0x10)
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-9191: Driver w83627hf
> registered
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: SMBus Timeout! (0x10)
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-9191: Driver w83781d-isa
> registered
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-0: SMBus Timeout! (0x10)
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)
> syslog:Nov 25 08:31:58 discover kernel: i2c_adapter i2c-1: SMBus Timeout! (0x10)


Well, this is from the probing phase. It can happen when there is no device at
the probed address. It seems the SMBus controller follows the ACPI 2.0 proposal
(13.9.1.1 Status Register, SMB_STS), where 0x10 means that the transaction
failed because the slave device address was not acknowledged.
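
As a rough sketch of that decoding (Python, illustrative only): the snippet below pulls the status code out of a log line in the format quoted above and looks it up in a small table. Only the 0x10 entry comes from this thread. The other entries are recalled from the ACPI SMBus status-code table and should be verified against the specification, and whether the nForce2 controller really uses these codes is the assumption stated above.

```python
import re

# 0x10 is the meaning given in this thread; the remaining entries are
# assumptions recalled from the ACPI status-code table (verify before use).
SMBUS_STATUS = {
    0x00: "OK",
    0x10: "slave device address not acknowledged",
    0x18: "timeout",
    0x1A: "busy",
}

# Matches e.g. "i2c_adapter i2c-1: SMBus Timeout! (0x10)"
LOG_RE = re.compile(r"(i2c-\d+): SMBus Timeout! \((0x[0-9a-fA-F]+)\)")


def decode_line(line):
    """Return (adapter, code, meaning) for a matching log line, else None."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    code = int(match.group(2), 16)
    return match.group(1), code, SMBUS_STATUS.get(code, "unknown status")


sample = ("syslog:Nov 25 08:31:58 discover kernel: "
          "i2c_adapter i2c-1: SMBus Timeout! (0x10)")
print(decode_line(sample))
```

Read this way, the "Timeout!" wording in the log is misleading for 0x10: the message would simply mean that nothing answered at the probed address.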

I will need the part of the log written just after you issue the "sensors"
command (if there is some error in the log...). Also, when the error condition
(strange temperature) occurs, please send the output of the following command:

i2cdump -y 1 0x2e

This will show whether the chip reports sane values in its registers. If so,
then there is something wrong with the "analog" path to the chip.
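
A minimal sketch of how a single byte from such a dump could be read (Python, illustrative only). It assumes the temperature registers hold signed 8-bit two's-complement values, and that 0x80 is what LM85-family chips report for a faulty or missing diode. Both points are assumptions to check against the chip datasheet, and the helper names are made up.

```python
# Assumption: LM85-family chips report 0x80 for a faulty or absent diode.
FAULT_BYTE = 0x80


def signed8(byte):
    """Interpret a register byte (0..255) as a signed 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("register value must fit in one byte")
    return byte - 0x100 if byte & 0x80 else byte


def describe(byte):
    """Format one register byte, flagging the assumed fault pattern."""
    note = " (possible sensor fault)" if byte == FAULT_BYTE else ""
    return f"0x{byte:02x} -> {signed8(byte):+d}{note}"


for value in (0x7A, 0x80, 0xFF):
    print(describe(value))
```

If the temperature bytes in the dump sit at or near 0x80 while the voltage and fan registers look normal, that would point away from the bus and towards the sensor input.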

Does it work again after you power off the machine, unplug the power cable, and
then try again? What kernel are you using? When it was not working and you
entered the BIOS, I guess it worked in there? (No need to hurry with a real
test; the i2cdump output may be enough.)

The problem is that I don't have the nVidia datasheet, so I will need to guess
if it turns out that there is a problem with the bus driver...

Regards
Rudolf




