* [lm-sensors] w83627 driver bug report
@ 2012-06-14 13:33 Igor Netkachev
  2012-06-14 22:07 ` Guenter Roeck
  2012-06-18  8:30 ` Igor Netkachev
  0 siblings, 2 replies; 3+ messages in thread
From: Igor Netkachev @ 2012-06-14 13:33 UTC
  To: lm-sensors


Greetings,


We have a persistent problem with the w83627dhg driver on several machines.

Software version: lm_sensors-2.10.7-9.el5
Sensors driver name: w83627dhg-isa-0a10
OS: CentOS release 5.8 (Final)


============
Description:
============
We use lm_sensors together with the nagios/nrpe check_sensors plugin to
monitor sensor status. The problem is that after we edit sensors.conf to
suit our needs (e.g. ignoring inactive fans or setting min and max values
for specific sensors) and apply the changes with "sensors -s", the sensor
configuration drops back to defaults at a random moment, usually within
~5-30 hours after "sensors -s" was run. This makes nagios/nrpe raise a
false alarm and send an e-mail to the customer, which causes us a lot of
pain because it happens almost every night.
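
The symptom described above can be detected mechanically. The following is only a minimal sketch, not part of lm_sensors or the check_sensors plugin: the helper names `parse_fans` and `reset_fans` are hypothetical, and the expected limits (min = 6026 RPM, div = 2) are taken from the "good" output shown in this report. It parses the fan lines of the `sensors` text output and lists the fans whose limits no longer match.

```python
import re

# Matches fan lines such as:
#   "Case Fan: 11637 RPM  (min = 12053 RPM, div = 1) ALARM"
FAN_LINE = re.compile(
    r"^(?P<label>[^:]+):\s+(?P<rpm>\d+) RPM\s+"
    r"\(min =\s*(?P<min>\d+) RPM, div = (?P<div>\d+)\)"
)


def parse_fans(sensors_output):
    """Return {label: (min_rpm, divisor)} for every fan line found."""
    fans = {}
    for line in sensors_output.splitlines():
        m = FAN_LINE.match(line.strip())
        if m:
            fans[m.group("label")] = (int(m.group("min")), int(m.group("div")))
    return fans


def reset_fans(sensors_output, expected=(6026, 2)):
    """List the fans whose (min, div) differ from the configured values.

    `expected` is an assumption taken from the output in this report.
    """
    return [
        label
        for label, limits in parse_fans(sensors_output).items()
        if limits != expected
    ]
```

A periodic job could feed the output of `sensors` to `reset_fans` and re-run `sensors -s` when the list is non-empty. That would only mask the symptom; it does not explain why the limits are lost.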

========
Example:
========
Below is the output of sensors right after it fell back to defaults:

root@working ~ # sensors
w83627dhg-isa-0a10
Adapter: ISA adapter
Case Fan: 11637 RPM  (min = 12053 RPM, div = 1) ALARM
CPU Fan:  11739 RPM  (min =    0 RPM, div = 1) ALARM
CPU Temp:  +38.0 C  (high =  +0.0 C, hyst = +60.0 C)  [CPU diode ]
AUX Temp:  +29.5 C  (high = +45.0 C, hyst = +60.0 C)  [thermistor]
vid:      +1.300 V

Running "sensors -s" again solves the problem...

root@working ~ # sensors -s
root@working ~ # sensors
w83627dhg-isa-0a10
Adapter: ISA adapter
Case Fan: 11440 RPM  (min = 6026 RPM, div = 2)
CPU Fan:  11637 RPM  (min = 6026 RPM, div = 2)
CPU Temp:  +38.0 C  (high = +56.0 C, hyst = +60.0 C)  [CPU diode ]
AUX Temp:  +29.5 C  (high = +45.0 C, hyst = +60.0 C)  [thermistor]
vid:      +1.300 V

but only for ~5-30 hours, until it drops back to defaults again.
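
A note on the numbers in the two listings: the Case Fan limit after the reset (12053 RPM, div = 1) is almost exactly twice the configured one (6026 RPM, div = 2). Assuming the usual Winbond conversion RPM = 1350000 / (count * divisor), which is an assumption and is not stated anywhere in this report, both values correspond to the same 8-bit limit count of 112 with only the divisor changed. The sketch below just redoes that arithmetic; `rpm_from_count` is a hypothetical helper, not a driver function.

```python
# Assumed constant for Winbond Super-I/O fan tachometers; not taken from this report.
WINBOND_FAN_CLOCK = 1350000


def rpm_from_count(count, divisor):
    """Convert an 8-bit tachometer count to RPM (integer division).

    Counts of 0 and 255 are treated as "no reading / no limit" and map to 0 RPM.
    """
    if count in (0, 255):
        return 0
    return WINBOND_FAN_CLOCK // (count * divisor)


# Same limit count, two different divisors:
configured = rpm_from_count(112, 2)   # limit as shown after "sensors -s"
after_reset = rpm_from_count(112, 1)  # limit as shown after the reset
```

Under that assumption, the Case Fan figures are consistent with the divisor dropping from 2 to 1 while the limit count stayed in place. The CPU Fan limit of 0 RPM is not explained by a divisor change alone.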

So far we have not found any dependency between the bug and the OS or
chassis: it occurs on different machines and OSes (so far we have seen it
on CentOS 5/6 and Debian 5/6).
Please investigate. Feel free to request any additional information you
might need.


The part of the config file related to w83627dhg-isa-0a10 is below:

====
# Winbond W83627EHF configuration originally contributed by Leon Moonen
# This is for an Asus P5P800, voltages for A8V-E SE.
chip "w83627ehf-*" "w83627dhg-*"

#    set fan1_min    2000
#    set fan2_min    2000
    ignore in0
    ignore in1
    ignore in2
    ignore in3
    ignore in4
    ignore in5
    ignore in6
    ignore in7
    ignore in8
    ignore fan5
    label in0 "VCore"
    label in2 "AVCC"
    label in3 "3VCC"
    label in7 "VSB"
    label in8 "VBAT"

# The W83627DHG has no in9, uncomment the following line
#    ignore in9

# +12V is in1 and +5V is in6 as recommended by datasheet
    compute in1 @*(1+(56/10)),  @/(1+(56/10))
    compute in6 @*(1+(22/10)),  @/(1+(22/10))
#    set in1_min   12.0*0.9
#    set in1_max   12.0*1.1
#    set in6_min   5.0*0.95
#    set in6_max   5.0*1.05

# Set the 3.3V
#    set in2_min   3.3*0.95
#    set in2_max   3.3*1.05
#    set in3_min   3.3*0.95
#    set in3_max   3.3*1.05
#    set in7_min   3.3*0.95
#    set in7_max   3.3*1.05
#    set in8_min   3.3*0.95
#    set in8_max   3.3*1.05

# Fans
   label fan1      "Case Fan"
   label fan2      "CPU Fan"
   label fan3      "Aux Fan"
#   ignore fan1
#   ignore fan2
   ignore fan3
   ignore fan4
#   ignore fan5
#  ignore fan4

set fan1_min 6000
set fan2_min 6000
set fan4_min 6000
set fan5_min 6000

# Temperatures
   #set temp2_max   90
   #set temp2_over  95
   #set temp2_hyst  110
   label temp1     "Sys Temp"
   label temp2     "CPU Temp"
   label temp3     "AUX Temp"

ignore temp1
#   ignore temp2
#   ignore temp3
#  ignore temp3
  set temp2_over  56
  set temp2_hyst  60

  set temp3_over  45
  set temp3_hyst  60
====
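
For readers unfamiliar with the "compute" lines in the excerpt above: the first expression scales the raw reading (`@`) into the displayed value, and the second is the inverse, used when a limit is written back. The sketch below only redoes that arithmetic for in1 and in6 with the resistor ratios from the config; the function names are hypothetical. Note that in1 and in6 are both on "ignore" lines in this config, so these formulas have no visible effect here.

```python
def in1_from_raw(raw_volts):
    """First expression of `compute in1`: raw pin voltage -> +12V rail reading."""
    return raw_volts * (1 + (56 / 10))


def in1_to_raw(rail_volts):
    """Second expression of `compute in1`: +12V rail value -> raw pin voltage."""
    return rail_volts / (1 + (56 / 10))


def in6_from_raw(raw_volts):
    """First expression of `compute in6`: raw pin voltage -> +5V rail reading."""
    return raw_volts * (1 + (22 / 10))


def in6_to_raw(rail_volts):
    """Second expression of `compute in6`: +5V rail value -> raw pin voltage."""
    return rail_volts / (1 + (22 / 10))


# A nominal 12.0 V rail corresponds to roughly 1.82 V at the chip pin,
# and a nominal 5.0 V rail to roughly 1.56 V.
raw_12v = in1_to_raw(12.0)
raw_5v = in6_to_raw(5.0)
```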


-- 
Best regards, Igor Netkachev
NK support
netkachev@nksupport.com
http://www.nksupport.com

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

* Re: [lm-sensors] w83627 driver bug report
  2012-06-14 13:33 [lm-sensors] w83627 driver bug report Igor Netkachev
@ 2012-06-14 22:07 ` Guenter Roeck
  2012-06-18  8:30 ` Igor Netkachev
  1 sibling, 0 replies; 3+ messages in thread
From: Guenter Roeck @ 2012-06-14 22:07 UTC
  To: lm-sensors

Hi Igor,

It almost looks like the chip might reset itself.

Of course that could be caused by anything, but it is odd that it happens
on multiple machines. Do those machines use IPMI or ACPI to access the chip,
by any chance? ASUS systems do that, for example, and the use of a hwmon chip
driver is generally not recommended for such machines and may cause all sorts
of issues.
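
One rough way to look for the IPMI side of that question from the host is sketched below. It is an assumption-laden check, not a definitive test: it only looks for loaded kernel modules whose names start with "ipmi_" in /proc/modules and for a few commonly used IPMI device nodes. The helper names and the list of device paths are illustrative, and a BMC can access the monitoring chip even when none of these are present on the host.

```python
import os


def loaded_ipmi_modules(proc_modules_text):
    """Return the sorted names of loaded modules that start with 'ipmi_'.

    `proc_modules_text` is the content of /proc/modules, where the first
    whitespace-separated field of each line is the module name.
    """
    names = (
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    )
    return sorted(name for name in names if name.startswith("ipmi_"))


def ipmi_hints(
    modules_path="/proc/modules",
    devices=("/dev/ipmi0", "/dev/ipmi/0", "/dev/ipmidev/0"),  # assumed paths
):
    """Collect hints that an IPMI interface is in use on this host."""
    try:
        with open(modules_path) as handle:
            text = handle.read()
    except OSError:
        text = ""
    return {
        "modules": loaded_ipmi_modules(text),
        "devices": [path for path in devices if os.path.exists(path)],
    }
```

Empty results here do not rule out ACPI or BMC access to the chip; they only mean the host itself has no IPMI driver loaded.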

Thanks,
Guenter



* Re: [lm-sensors] w83627 driver bug report
  2012-06-14 13:33 [lm-sensors] w83627 driver bug report Igor Netkachev
  2012-06-14 22:07 ` Guenter Roeck
@ 2012-06-18  8:30 ` Igor Netkachev
  1 sibling, 0 replies; 3+ messages in thread
From: Igor Netkachev @ 2012-06-18  8:30 UTC
  To: lm-sensors


Hi Guenter,

We are a support service provider, so we have seen this bug on different
machines with different (UNIX-based) OSes in completely different
datacenters. At the moment we have two machines with this problem. Here is
the BIOS and motherboard information from dmidecode for both:

========
Machine 1
========
BIOS Information
        Vendor: American Megatrends Inc.
        Version: 1.2
        Release Date: 08/19/11
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 4096 kB
        Characteristics:
                ISA is supported
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                ESCD support is available
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                LS-120 boot is supported
                ATAPI Zip drive boot is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
        BIOS Revision: 8.16

System Information
        Manufacturer: Supermicro
        Product Name: X8SIE
        Version: 0123456789
        Serial Number: 0123456789
        UUID: 49434D53-0200-9008-2500-089025002931
        Wake-up Type: Power Switch
        SKU Number: To Be Filled By O.E.M.
        Family: To Be Filled By O.E.M.

Base Board Information
        Manufacturer: Supermicro
        Product Name: X8SIE
        Version: 0123456789
        Serial Number: 0123456789
        Asset Tag: To Be Filled By O.E.M.
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: To Be Filled By O.E.M.
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0


========
Machine 2
========
BIOS Information
        Vendor: American Megatrends Inc.
        Version: 1.1a
        Release Date: 09/27/10
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 4096 kB
        Characteristics:
                ISA is supported
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                ESCD support is available
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                LS-120 boot is supported
                ATAPI Zip drive boot is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
        BIOS Revision: 8.16

System Information
        Manufacturer: Supermicro
        Product Name: X8SIT
        Version: 0123456789
        Serial Number: 0123456789
        UUID: 49434D53-0200-900D-2500-0D902500453E
        Wake-up Type: Power Switch
        SKU Number: To Be Filled By O.E.M.
        Family: To Be Filled By O.E.M.

Base Board Information
        Manufacturer: Supermicro
        Product Name: X8SIT
        Version: 0123456789
        Serial Number: 0123456789
        Asset Tag: To Be Filled By O.E.M.
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: To Be Filled By O.E.M.
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0


-- 
Best regards, Igor Netkachev
NK support
netkachev@nksupport.com
http://www.nksupport.com

