* [lm-sensors] after some days,
From: Arne Riecken @ 2010-05-11  8:55 UTC
  To: lm-sensors

Hi,

After some days, some min or max values for sensors become zero, and
therefore an alarm is triggered. sensors -s cures that, but only if
the values are defined in sensors3.conf. If default values are
used, there is no way out; you have to set them manually.

chip "w83627ehf-*" "w83627dhg-*"

Linux 2.6.26-2-xen-amd64

OS Debian 5.0 "Lenny"

For example, today VBAT max was suddenly 0. I had to reread it with
sensors -s; as you can see below, the value is correct again.

How to fix that?

# sensors
w83627dhg-isa-0a10
Adapter: ISA adapter
VCore:       +0.91 V  (min =  +0.60 V, max =  +1.49 V)
in1:        +12.20 V  (min = +10.82 V, max = +13.20 V)
AVCC:        +3.26 V  (min =  +3.14 V, max =  +3.47 V)
3VCC:        +3.26 V  (min =  +3.14 V, max =  +3.47 V)
in4:         +1.55 V  (min =  +1.35 V, max =  +1.65 V)
in6:         +4.74 V  (min =  +4.25 V, max =  +5.25 V)
VSB:         +3.26 V  (min =  +2.98 V, max =  +3.47 V)
VBAT:        +3.12 V  (min =  +2.80 V, max =  +3.47 V)
CPU Fan:    2636 RPM  (min = 1205 RPM, div = 8)
Aux Fan:    12735 RPM  (min = 8035 RPM, div = 1)
Sys Temp:    +31.0°C  (high = +60.0°C, hyst = +50.0°C)  sensor = thermistor
CPU Temp:    +38.5°C  (high = +95.0°C, hyst = +90.0°C)  sensor = diode
AUX Temp:    +38.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = diode
cpu0_vid:   +1.300 V
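
A minimal sketch, assuming output in the format shown above, of how a
zeroed limit could be spotted automatically instead of by eye. The
function name find_zero_limits and the regular expression are
illustrative only and not part of lm-sensors; the sample line with
max = +0.00 V is made up to show the failure case, it is not captured
output.

```python
import re

# Matches "name = value" pairs inside the parentheses, e.g. "max =  +3.47 V".
# "div" and "sensor" are deliberately not matched: they are not limits.
LIMIT_RE = re.compile(r"(min|max|high|hyst)\s*=\s*([+-]?\d+(?:\.\d+)?)")


def find_zero_limits(sensors_output):
    """Return (label, limit_name) pairs whose limit reads exactly zero."""
    zeroed = []
    for line in sensors_output.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or "(" not in rest:
            continue  # chip name, adapter line, or a reading without limits
        limits = rest[rest.index("("):]
        for name, value in LIMIT_RE.findall(limits):
            if float(value) == 0.0:
                zeroed.append((label.strip(), name))
    return zeroed


# Hypothetical sample: the VBAT max limit has been altered to zero.
sample = """\
VSB:         +3.26 V  (min =  +2.98 V, max =  +3.47 V)
VBAT:        +3.12 V  (min =  +2.80 V, max =  +0.00 V)
CPU Fan:    2636 RPM  (min = 1205 RPM, div = 8)
"""
print(find_zero_limits(sample))  # [('VBAT', 'max')]
```

A limit that is legitimately configured as 0 would also be reported, so
the result is only a hint that something needs a closer look.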

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors


* Re: [lm-sensors] after some days,
From: Jean Delvare @ 2010-05-11  9:39 UTC
  To: lm-sensors

Hi Arne,

On Tue, 11 May 2010 10:55:27 +0200, Arne Riecken wrote:
> After some days, some min or max values for sensors become zero, and
> therefore an alarm is triggered. sensors -s cures that, but only if
> the values are defined in sensors3.conf. If default values are
> used, there is no way out; you have to set them manually.
> 
> chip "w83627ehf-*" "w83627dhg-*"
> 
> Linux 2.6.26-2-xen-amd64
> 
> OS Debian 5.0 "Lenny"
> 
> For example, today VBAT max was suddenly 0. I had to reread it with
> sensors -s; as you can see below, the value is correct again.

Actually, sensors -s is _writing_ the limits to the chip, not reading
them.

> How to fix that?
> 
> # sensors
> w83627dhg-isa-0a10
> Adapter: ISA adapter
> VCore:       +0.91 V  (min =  +0.60 V, max =  +1.49 V)
> in1:        +12.20 V  (min = +10.82 V, max = +13.20 V)
> AVCC:        +3.26 V  (min =  +3.14 V, max =  +3.47 V)
> 3VCC:        +3.26 V  (min =  +3.14 V, max =  +3.47 V)
> in4:         +1.55 V  (min =  +1.35 V, max =  +1.65 V)
> in6:         +4.74 V  (min =  +4.25 V, max =  +5.25 V)
> VSB:         +3.26 V  (min =  +2.98 V, max =  +3.47 V)
> VBAT:        +3.12 V  (min =  +2.80 V, max =  +3.47 V)
> CPU Fan:    2636 RPM  (min = 1205 RPM, div = 8)
> Aux Fan:    12735 RPM  (min = 8035 RPM, div = 1)
> Sys Temp:    +31.0°C  (high = +60.0°C, hyst = +50.0°C)  sensor = thermistor
> CPU Temp:    +38.5°C  (high = +95.0°C, hyst = +90.0°C)  sensor = diode
> AUX Temp:    +38.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = diode
> cpu0_vid:   +1.300 V

Which hardware is this? Odds are that the registers are being
overwritten by the BIOS, or the hardware is faulty.

If you can easily reproduce the problem, the first thing to try is to
unload the w83627ehf driver and check whether the problem still
happens. You'll have to use "isadump 0xa15 0xa16" to dump the register
values, and check for the value of registers 0x2b to 0x3e. My guess is
that you will see them go to 0 even without the driver. If I am right,
this will prove the driver is innocent.
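
The check described above (registers 0x2b to 0x3e going to 0) could be
scripted roughly as follows. This is only a sketch: it assumes that
isadump prints a hexadecimal table made of a column header followed by
rows of the form "20: xx xx ...", sixteen bytes per row. The function
names and the sample bytes are invented for illustration and are not
real readings from this board.

```python
def parse_dump(text):
    """Map register address -> byte value from an isadump-style hex table."""
    regs = {}
    for line in text.splitlines():
        head, sep, body = line.partition(":")
        if not sep:
            continue  # column header or blank line
        try:
            base = int(head.strip(), 16)
        except ValueError:
            continue  # not a data row
        for offset, token in enumerate(body.split()[:16]):
            try:
                regs[base + offset] = int(token, 16)
            except ValueError:
                pass  # skip anything that is not a hex byte
    return regs


def zero_registers(regs, first=0x2B, last=0x3E):
    """Return the addresses in [first, last] whose value is 0."""
    return [addr for addr in range(first, last + 1) if regs.get(addr) == 0]


# Hypothetical dump excerpt: register 0x2b reads 00.
sample = """\
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
20: 39 cb cc cc 61 b8 cc c3 ff ff ff 00 25 e1 c4 da
30: b5 dd b2 dd a9 d3 be da af 3c 32 5f 5a 50 4b 00
"""
print([hex(addr) for addr in zero_registers(parse_dump(sample))])  # ['0x2b']
```

Running this on dumps saved at intervals, with the driver unloaded,
would show whether a register drops to 0 without the driver involved.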

It would also be a good idea to disassemble the ACPI DSDT table to check
whether ACPI is poking at the hardware monitoring registers. If you
send your table to me in private, I'll take a look.

-- 
Jean Delvare
http://khali.linux-fr.org/wishlist.html


* Re: [lm-sensors] after some days,
From: Arne Riecken @ 2010-05-11 10:57 UTC
  To: lm-sensors

Thanks,

mainboard is a Super Micro X8SIE-F

I do not know how to reproduce it. It happens every week or so.



2010/5/11 Jean Delvare <khali@linux-fr.org>:
> Hi Arne,
>
> On Tue, 11 May 2010 10:55:27 +0200, Arne Riecken wrote:
>> After some days, some min or max values for sensors become zero, and
>> therefore an alarm is triggered. sensors -s cures that, but only if
>> the values are defined in sensors3.conf. If default values are
>> used, there is no way out; you have to set them manually.
>>
>> chip "w83627ehf-*" "w83627dhg-*"
>>
>> Linux 2.6.26-2-xen-amd64
>>
>> OS Debian 5.0 "Lenny"
>>
>> For example, today VBAT max was suddenly 0. I had to reread it with
>> sensors -s; as you can see below, the value is correct again.
>
> Actually, sensors -s is _writing_ the limits to the chip, not reading
> them.
>
>> How to fix that?
>>
>> # sensors
>> w83627dhg-isa-0a10
>> Adapter: ISA adapter
>> VCore:       +0.91 V  (min =  +0.60 V, max =  +1.49 V)
>> in1:        +12.20 V  (min = +10.82 V, max = +13.20 V)
>> AVCC:        +3.26 V  (min =  +3.14 V, max =  +3.47 V)
>> 3VCC:        +3.26 V  (min =  +3.14 V, max =  +3.47 V)
>> in4:         +1.55 V  (min =  +1.35 V, max =  +1.65 V)
>> in6:         +4.74 V  (min =  +4.25 V, max =  +5.25 V)
>> VSB:         +3.26 V  (min =  +2.98 V, max =  +3.47 V)
>> VBAT:        +3.12 V  (min =  +2.80 V, max =  +3.47 V)
>> CPU Fan:    2636 RPM  (min = 1205 RPM, div = 8)
>> Aux Fan:    12735 RPM  (min = 8035 RPM, div = 1)
>> Sys Temp:    +31.0°C  (high = +60.0°C, hyst = +50.0°C)  sensor = thermistor
>> CPU Temp:    +38.5°C  (high = +95.0°C, hyst = +90.0°C)  sensor = diode
>> AUX Temp:    +38.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = diode
>> cpu0_vid:   +1.300 V
>
> Which hardware is this? Odds are that the registers are being
> overwritten by the BIOS, or the hardware is faulty.
>
> If you can easily reproduce the problem, the first thing to try is to
> unload the w83627ehf driver and check whether the problem still
> happens. You'll have to use "isadump 0xa15 0xa16" to dump the register
> values, and check for the value of registers 0x2b to 0x3e. My guess is
> that you will see them go to 0 even without the driver. If I am right,
> this will prove the driver is innocent.
>
> It would also be a good idea to disassemble the ACPI DSDT table to check
> whether ACPI is poking at the hardware monitoring registers. If you
> send your table to me in private, I'll take a look.
>
> --
> Jean Delvare
> http://khali.linux-fr.org/wishlist.html
>


* Re: [lm-sensors] after some days,
From: Jean Delvare @ 2010-05-12  7:33 UTC
  To: lm-sensors

On Tue, 11 May 2010 11:39:33 +0200, Jean Delvare wrote:
> It would also be a good idea to disassemble the ACPI DSDT table to check
> whether ACPI is poking at the hardware monitoring registers. If you
> send your table to me in private, I'll take a look.

I took a look, I see the BIOS is defining an I/O region as part of
PNP0C02 for the hardware monitoring chip (you probably can see it
in /proc/ioports), but it doesn't seem to make any use of it. So I
would rule out a BIOS interaction.
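
To see which entries in /proc/ioports cover the chip's ports, something
like the following sketch could be used. It assumes the usual
"start-end : owner" line format with hexadecimal addresses. The
function name is made up, and the sample text is a hypothetical excerpt
rather than the real listing from this board.

```python
def regions_covering(ioports_text, address):
    """Return (start, end, owner) for every listed range containing address."""
    hits = []
    for line in ioports_text.splitlines():
        span, sep, owner = line.partition(" : ")
        if not sep:
            continue  # not a "range : owner" line
        start_s, dash, end_s = span.strip().partition("-")
        if not dash:
            continue
        try:
            start, end = int(start_s, 16), int(end_s, 16)
        except ValueError:
            continue  # malformed range
        if start <= address <= end:
            hits.append((start, end, owner.strip()))
    return hits


# Hypothetical excerpt; nested entries are indented as in /proc/ioports.
sample = """\
0000-0cf7 : PCI Bus 0000:00
  0a10-0a1f : pnp 00:09
    0a15-0a16 : w83627ehf
0cf8-0cff : PCI conf1
"""
for start, end, owner in regions_covering(sample, 0xA15):
    print("%04x-%04x : %s" % (start, end, owner))
```

On a live system the text would come from reading /proc/ioports,
typically as root so that the addresses are not hidden.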

-- 
Jean Delvare


* Re: [lm-sensors] after some days,
From: Arne Riecken @ 2010-05-17  8:01 UTC
  To: lm-sensors

What should I do now to track down the problem? I need to monitor the
sensor data, but running sensors -s in a cron job is no real solution.

2010/5/12 Jean Delvare <khali@linux-fr.org>:
> On Tue, 11 May 2010 11:39:33 +0200, Jean Delvare wrote:
> I took a look, I see the BIOS is defining an I/O region as part of
> PNP0C02 for the hardware monitoring chip (you probably can see it
> in /proc/ioports), but it doesn't seem to make any use of it. So I
> would rule out a BIOS interaction.
>
> --
> Jean Delvare
>


* Re: [lm-sensors] after some days,
From: Jean Delvare @ 2010-05-20 12:03 UTC
  To: lm-sensors

Hi Arne,

Please don't top-post.

> 2010/5/12 Jean Delvare <khali@linux-fr.org>:
> > On Tue, 11 May 2010 11:39:33 +0200, Jean Delvare wrote:
> > I took a look, I see the BIOS is defining an I/O region as part of
> > PNP0C02 for the hardware monitoring chip (you probably can see it
> > in /proc/ioports), but it doesn't seem to make any use of it. So I
> > would rule out a BIOS interaction.

On Mon, 17 May 2010 10:01:41 +0200, Arne Riecken wrote:
> What should I do now to track down the problem? I need to monitor the
> sensor data, but running sensors -s in a cron job is no real solution.

We still have to determine whether the driver is responsible for the
issue or not. If you can't live without the driver being loaded, we
would have to either add an option to switch the driver to read-only
mode, or log all writes so that you can check exactly what happens.
In both cases, this requires rebuilding the driver. Are you willing and
able to do this? If you are, let me know and I'll provide a patch.
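
Separately from any driver patch, the time at which a limit changes
could be narrowed down from user space. The sketch below is not the
driver-side write logging mentioned above: it only compares two
snapshots of the *_min and *_max sysfs attributes and cannot tell who
changed them. The directory passed to read_limits is an assumption, as
the location of the hwmon attributes depends on the kernel version, and
both function names are made up.

```python
import glob
import os


def read_limits(hwmon_dir):
    """Read every *_min and *_max attribute in hwmon_dir into a dict."""
    limits = {}
    for pattern in ("*_min", "*_max"):
        for path in glob.glob(os.path.join(hwmon_dir, pattern)):
            try:
                with open(path) as handle:
                    limits[os.path.basename(path)] = int(handle.read().strip())
            except (OSError, ValueError):
                continue  # unreadable or non-numeric attribute
    return limits


def changed_limits(before, after):
    """Return {name: (old, new)} for limits whose value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# Example with made-up values (millivolts, as hwmon sysfs reports them).
old = {"in8_min": 2800, "in8_max": 3472}
new = {"in8_min": 2800, "in8_max": 0}
print(changed_limits(old, new))  # {'in8_max': (3472, 0)}
```

Calling read_limits periodically and logging each non-empty result of
changed_limits with a timestamp would at least date the change.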

Honestly, I suspect a hardware defect. Nobody else ever reported a
similar problem, while this driver is very popular.

-- 
Jean Delvare
