* [lm-sensors] Question on my mainboard
2006-03-28 19:50 [lm-sensors] Question on my mainboard Dieter Jurzitza
@ 2006-03-30 21:45 ` Rudolf Marek
2006-03-31 10:27 ` Jean Delvare
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Rudolf Marek @ 2006-03-30 21:45 UTC (permalink / raw)
To: lm-sensors
Hello Dieter,
> Dear listmembers,
> I am using several Tyan Tiger 2460 motherboards since quite a while. And I am
> observing a strange behaviour someone in the list (might) have a clue about.
>
> I am reading the sensors data periodically every 10 seconds. After an
> undefined time (maybe one week, maybe one day, maybe three weeks) sensors
> "hangs" after start and cannot be killed any more. A manual call to sensors
> ends in a hang as well (of sensors, the board is running on).
>
Well, perhaps I will have further questions:
Hmm, this looks like some race condition in the kernel, maybe? If you do ps ax,
can you see the process in the D state? Any Oops message in the log?
Do you use preempt? (perhaps not)
Does cat hang if you, for example, cat some value from /sys/bus/i2c/devices.../temp1_input?
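The D-state check can be sketched as a small filter. This assumes a procps-style `ps ax` whose third column is STAT; the helper name filter_dstate is made up for illustration and is not part of lm-sensors:

```shell
#!/bin/sh
# Keep only processes in uninterruptible sleep (STAT starting with "D").
# Assumption: input looks like `ps ax` output, i.e. a header line followed
# by rows of: PID TTY STAT TIME COMMAND
filter_dstate() {
    awk 'NR > 1 && $3 ~ /^D/'
}

# Usage:
#   ps ax | filter_dstate
```

A process that stays in this list across several runs is blocked inside the kernel, which is why it cannot be killed.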
Regards
Rudolf
* [lm-sensors] Question on my mainboard
From: Jean Delvare @ 2006-03-31 10:27 UTC (permalink / raw)
To: lm-sensors
Hi Dieter,
> I am using several Tyan Tiger 2460 motherboards since quite a while. And I am
> observing a strange behaviour someone in the list (might) have a clue about.
>
> I am reading the sensors data periodically every 10 seconds. After an
> undefined time (maybe one week, maybe one day, maybe three weeks) sensors
> "hangs" after start and cannot be killed any more. A manual call to sensors
> ends in a hang as well (of sensors, the board is running on).
>
> What is my configuration:
> I am loading
> MODULE_0=i2c-amd756
> MODULE_1=i2c-isa
> MODULE_2=w83781d
> MODULE_3=eeprom
> MODULE_4=w83627hf
> (in this sequence).
>
> By recommendation from Tyan I use the line:
> options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49
> force_w83627hf=0,0x2c force_subclients=0,0x2c,0x4a,0x4b init=0
>
> in /etc/modprobe.conf.local
This doesn't make much sense. You ask the w83781d driver to take care
of the W83627HF chip, but you also load the w83627hf driver. Both drivers
will attempt to handle the same chip, which might explain the deadlock.
I'd suggest that you try without loading the w83627hf driver. If the
w83781d driver is able to handle both chips (the W83782D and the
W83627HF), then the w83627hf driver is not needed at all.
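One way to confirm whether both drivers are loaded at the same time is to look for the two module names in lsmod output. This is only a sketch: the helper name both_loaded is hypothetical, and it assumes the usual lsmod layout with the module name in the first column.

```shell
#!/bin/sh
# Exit status 0 if both w83781d and w83627hf appear in lsmod-style input
# on stdin, 1 otherwise. Assumption: module name is the first column.
both_loaded() {
    awk '$1 == "w83781d"  { a = 1 }
         $1 == "w83627hf" { b = 1 }
         END { exit !(a && b) }'
}

# Usage:
#   lsmod | both_loaded && echo "both drivers are loaded"
```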
--
Jean Delvare
* [lm-sensors] Question on my mainboard
From: Steven Timm @ 2006-03-31 14:50 UTC (permalink / raw)
To: lm-sensors
On Fri, 31 Mar 2006, Jean Delvare wrote:
> Hi Dieter,
>
>> I am using several Tyan Tiger 2460 motherboards since quite a while. And I am
>> observing a strange behaviour someone in the list (might) have a clue about.
>>
>> I am reading the sensors data periodically every 10 seconds. After an
>> undefined time (maybe one week, maybe one day, maybe three weeks) sensors
>> "hangs" after start and cannot be killed any more. A manual call to sensors
>> ends in a hang as well (of sensors, the board is running on).
>>
>> What is my configuration:
>> I am loading
>> MODULE_0=i2c-amd756
>> MODULE_1=i2c-isa
>> MODULE_2=w83781d
>> MODULE_3=eeprom
>> MODULE_4=w83627hf
>> (in this sequence).
>>
>> By recommendation from Tyan I use the line:
>> options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49
>> force_w83627hf=0,0x2c force_subclients=0,0x2c,0x4a,0x4b init=0
>>
>> in /etc/modprobe.conf.local
>
> This doesn't make much sense. You ask the w83781d driver to take care
> of the W83627HF chip, but also load the w83627hf driver. Both drivers
> will attempt to handle the same chip, this might explain the deadlock.
This board and other Tyan boards of that era actually have two chips,
one of which is the W83627HF, and all of them have some strange
subclients like that, which are necessary. When I was running that
board, a custom sensors.conf was also necessary to
interpret the output correctly; the stock one that comes with
the source doesn't cut it. At that time Tyan had people who
made it for us; I'm not sure whether those folks are still with Tyan.
On the closely related Tyan 2466 board, I was following what
Jean said and only loading the w83781d driver. That should work.
Steve
>
> I'd suggest that you try without loading the w83627hf driver. If the
> w83781d driver is able to handle both chips (the W83782D and the
> W83627HF) then the w83627hf driver is not needed at all.
>
>
--
------------------------------------------------------------------
Steven C. Timm, Ph.D (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/
Fermilab Computing Div/Core Support Services Dept./Scientific Computing Section
Assistant Group Leader, Farms and Clustered Systems Group
Lead of Computing Farms Team
* [lm-sensors] Question on my mainboard
From: Dieter Jurzitza @ 2006-03-31 19:32 UTC (permalink / raw)
To: lm-sensors
Dear listmembers,
I tried loading only the w83781d driver, as you suggested, and, yes,
everything is working as before. The driver detects the sensor chips and
reads all values from both chips. And, yes, the W83627HF is read out even though
the corresponding module is not loaded. As already mentioned, the deadlock
takes a while to show up, so I'll wait and report what happens.
This, however, leads me to the question of why there are separate modules for
the w83781d and the w83627hf at all, if the first can handle both.
Thanks again
take care
Dieter
--
-----------------------------------------------------------
|
\
/\_/\ |
| ~x~ |/-----\ /
\ /- \_/
^^__ _ / _ ____ /
<??__ \- \_/ | |/ | |
|| || _| _| _| _|
if you really want to see the pictures above - use some font
with constant spacing like courier! :-)
-----------------------------------------------------------
* [lm-sensors] Question on my mainboard
From: Dieter Jurzitza @ 2006-04-02 18:18 UTC (permalink / raw)
To: lm-sensors
Dear Rudolf, dear listmembers,
unfortunately the problem was (apparently) not that both drivers (w83781d and
w83627hf) were running. Today the same thing happened again even though the
w83627hf driver is no longer loaded.
The "D"-state:
fred/Downloads> ps ax | grep sensors
2980 ? D 0:00 /usr/bin/sensors -c /etc/sensors.tempconf
and cat hangs when trying a
fred/Downloads> cat /sys/bus/i2c/devices/0-002c/alarms
but only for a while, then comes back with -257
(strange value, isn't it?)
fred/Downloads> cat /sys/bus/i2c/devices/0-002c/temp1_input
hangs for a while too, then comes back with -1000
The same happens for the devices named 0-002d/XXX.
By the way, this directory looks like:
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-002c
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-002c
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-002d
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-002d
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-0048
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-0048
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-0049
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-0049
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-004a
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-004a
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-004b
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-004b
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-0050
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-0050
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-0051
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-0051
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-0052
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-0052
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 0-0054
-> ../../../devices/pci0000:00/0000:00:07.3/i2c-0/0-0054
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 1-0050
-> ../../../devices/pci0000:00/0000:00:0c.0/i2c-1/1-0050
lrwxrwxrwx 1 root root 0 2006-04-02 11:20 1-0061
-> ../../../devices/pci0000:00/0000:00:0c.0/i2c-1/1-0061
By the way: what is preempt? How would it show?
There is one more thing I found: /usr/bin/sensors does not really hang
"forever". It only hangs for a certain time, then comes back, but the values
returned are meaningless. Just as cat <one of the values in /sys/XXX> hangs
for about 80 seconds, I think sensors hangs for N*80 seconds, where N is the
number of values it reads.
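The N*80 seconds guess could be checked by timing each read on its own. A rough sketch follows; time_read is a made-up helper name, and the resolution is whole seconds because it relies on `date +%s`:

```shell
#!/bin/sh
# Print "<path> <elapsed>s <value>" for a single sysfs attribute read.
# Hypothetical helper, not part of lm-sensors.
time_read() {
    start=$(date +%s)
    value=$(cat "$1" 2>/dev/null)
    end=$(date +%s)
    printf '%s %ss %s\n' "$1" "$((end - start))" "$value"
}

# Usage, with the attribute paths mentioned in this message:
#   for f in /sys/bus/i2c/devices/0-002c/alarms \
#            /sys/bus/i2c/devices/0-002c/temp1_input; do
#       time_read "$f"
#   done
```

If every attribute takes roughly the same time to come back, that would support the idea that each read waits for its own timeout.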
Thank you for any suggestion,
take care
Dieter
On Thursday, 30 March 2006 at 23:45, you wrote:
*******
>
> Hmm this seems some race condition in the kernel maybe? If you do ps ax
> can you see the D state of the process? Any OOPs message in the log?
>
> Do you use preempt? (perhaps not)
> Does cat hang if you do for example cat some value from
> /sys/bus/i2c/devices.../temp1_input
>
> Regards
> Rudolf
********
* [lm-sensors] Question on my mainboard
From: Rudolf Marek @ 2006-04-05 8:28 UTC (permalink / raw)
To: lm-sensors
Dieter Jurzitza wrote:
> Dear Rudolf, dear listmembers,
> unfortunately it was not the problem that both drivers (w83781d and w83627hf)
> are running (apparently). Today the same thing happened again even though the
> 83627hf is not loaded any more.
>
> The "D"-state:
> fred/Downloads> ps ax | grep sensors
> 2980 ? D 0:00 /usr/bin/sensors -c /etc/sensors.tempconf
>
> and cat hangs when trying a
> fred/Downloads> cat /sys/bus/i2c/devices/0-002c/alarms
> but only for a while, then comes back with -257
> (strange value, isn't it?)
>
This sounds like a bus problem to me. If you use only the w83627hf, I think the
issue should disappear.
If you wish to help solve the issue, continue using the w83781d and
enable the debug options in the kernel's i2c configuration (bus driver debug),
so that we can see whether there are problems with the transactions.
Regards
Rudolf
* [lm-sensors] Question on my mainboard
From: Dieter Jurzitza @ 2006-04-09 10:16 UTC (permalink / raw)
To: lm-sensors
Dear Rudolf,
would you kindly tell me where to activate this debugging stuff? I looked a bit
through the sensors sources but I haven't got a clue.
Thank you very much,
take care
Dieter
On Wednesday, 5 April 2006 at 10:28, Rudolf Marek wrote:
*******
> If you wish to help solve the issue continue using the w83781d and
> tick in the config of i2c the debug stuff. (Bus driver debug) So we will
*******
* [lm-sensors] Question on my mainboard
From: Rudolf Marek @ 2006-04-09 10:29 UTC (permalink / raw)
To: lm-sensors
Dieter Jurzitza wrote:
> Dear Rudolf,
> would you kindly tell me where activate this debugging stuff? I looked a bit
> through the sensors-sources but I haven't got a clue.
If you have a 2.6 kernel, and I think you do, it should be
in the kernel configuration menu, in the i2c section.
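To see how an existing kernel was configured, the relevant switches can be pulled out of its .config. This sketch assumes the options are named CONFIG_I2C_DEBUG_* (for example CONFIG_I2C_DEBUG_BUS); those names and the helper name i2c_debug_opts are assumptions, so check them against your own kernel tree:

```shell
#!/bin/sh
# List the I2C debug options in a kernel .config, whether they are set
# or not. Assumption: the options are named CONFIG_I2C_DEBUG_*.
i2c_debug_opts() {
    grep -E '^(# )?CONFIG_I2C_DEBUG_' "$1"
}

# Usage (the path is only an example):
#   i2c_debug_opts /usr/src/linux/.config
```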
Regards
Rudolf
* [lm-sensors] Question on my mainboard
From: Dieter Jurzitza @ 2006-04-18 18:43 UTC (permalink / raw)
To: lm-sensors
Dear Rudolf,
dear listmembers,
after building the kernel with debugging enabled, my /var/log/messages
file is being flooded (something like 100 MByte per day).
What I get is attached. What is apparent to me is that there are
error messages every other time (I attached just a few seconds' worth) and
irregular accesses (they should happen every ten seconds), but no hang like before.
So I have to wait a little longer. If I am unlucky, the delays introduced by the
tons of printfs keep sensors from crashing my sensor chip. But maybe some
of you guys can understand why the errors are being reported.
Thank you in advance,
take care
Dieter Jurzitza
Tyan Tiger 2460, SuSE 9.3, kernel 2.6.11-SMP (have seen similar issues with
older kernels, too).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: logpart.bz2
Type: application/x-bzip2
Size: 1336 bytes
Desc: not available
Url : http://lists.lm-sensors.org/pipermail/lm-sensors/attachments/20060418/c2de2d8c/logpart.bz2
* [lm-sensors] Question on my mainboard
From: Rudolf Marek @ 2006-04-18 20:30 UTC (permalink / raw)
To: lm-sensors
Dieter Jurzitza wrote:
> Dear Rudolf,
> dear listmembers,
> after building the kernel with debugging active, I flood my /var/log/messages
> file (something like 100MByte per day).
> What I get is attached. What is apparent to me is the fact that there are
> error messages every other time (I just attached several seconds) and non
> regular accesses (should happen every ten seconds) but no hang like before.
> So I have to wait a little more. If I am unlucky the delays introduced by the
> tons of printfs prevent sensors from crashing my sensor chip. But maybe some
> of you guys can understand why the errors are being reported.
> Thank you in advance,
> take care
Yep, we can.
There are errors reading addresses 0x4A, 0x4B,
and 0x2C.
options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49
force_w83627hf=0,0x2c force_subclients=0,0x2c,0x4a,0x4b init=0
So you left the second line there??? And you are only loading w83627hf with
force_subclients=0,0x2c,0x4a,0x4b ?
I'm asking because more addresses are being accessed:
(cat logpart | awk -FADD '{print $2}' |awk '{print $1}' | sort -u)
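The pipeline above splits each log line on the literal string ADD, keeps the first whitespace-separated token that follows it, and prints each distinct token once. The same idea as a reusable function might look like this; the name unique_addrs is made up, the log line format is assumed rather than known, and unlike the one-liner it skips lines that contain no ADD at all:

```shell
#!/bin/sh
# Print every distinct token that directly follows "ADD" in the input.
# Reads the named files, or stdin when called without arguments.
unique_addrs() {
    awk -F'ADD' 'NF > 1 { print $2 }' "$@" | awk '{ print $1 }' | sort -u
}

# Usage:
#   unique_addrs logpart
#   unique_addrs /var/log/messages
```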
Regards
Rudolf
* [lm-sensors] Question on my mainboard
From: Dieter Jurzitza @ 2006-04-18 20:53 UTC (permalink / raw)
To: lm-sensors
Dear Rudolf,
dear listmembers,
this is what I get when running the awk script Rudolf kindly provided
on the whole of /var/log/messages (~300 MByte):
\058,
\059,
\05a,
\05b,
\091,
\093,
\095,
\097,
\0a0,
\0a1,
\0a2,
\0a3,
\0a4,
\0a5,
\0a8,
\0a9,
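These values look like 8-bit bus address bytes, that is, the 7-bit chip address shifted left by one with the read/write flag in the lowest bit. That reading is an assumption (as is taking \058 to mean hex 58), but it would line up with the device entries listed earlier in the thread: 58 and 59 would both be chip 0x2c, 91 would be 0x48, and a0 would be 0x50. A small sketch to do the conversion, with decode_addr as a made-up helper name:

```shell
#!/bin/sh
# Convert an 8-bit address byte (hex digits, no prefix) into the 7-bit
# address plus direction. Assumption: byte = (7-bit address << 1) | R/W,
# with R/W = 1 meaning read.
decode_addr() {
    byte=$((0x$1))
    if [ $((byte & 1)) -eq 1 ]; then
        dir=read
    else
        dir=write
    fi
    printf '0x%02x %s\n' "$((byte >> 1))" "$dir"
}

# Decode the list reported above.
for b in 58 59 5a 5b 91 93 95 97 a0 a1 a2 a3 a4 a5 a8 a9; do
    decode_addr "$b"
done
```

Under that assumption the list covers 0x2c, 0x2d, 0x48 to 0x4b, and 0x50, 0x51, 0x52, 0x54, which are the same addresses that appear under /sys/bus/i2c/devices on bus 0.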
> options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49
> force_w83627hf=0,0x2c force_subclients=0,0x2c,0x4a,0x4b init=0
>
> So you left the second line there??? and only loading w83627hf with
> force_subclients=0,0x2c,0x4a,0x4b ?
Yes, I left the second line there. I had taken out the loading of the
w83627hf driver, but sensors crashed anyway, so I thought it would be good to
start again from where I had come from, which may be wrong.
> I'm asking because there are more addresses access
This:
> (cat logpart | awk -FADD '{print $2}' |awk '{print $1}' | sort -u)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gives me the list as shown above.
I left that there because the folks from Tyan told me to use exactly this line,
and the last replies from the lm-sensors list said only that I should
unload the w83627hf driver; they did not refer to the "force_subclients=XXX"
lines. After removing the w83627hf driver there was no change, so I
put it back in.
Thanks for your efforts - I am not the I²C man.
Take care
Dieter