* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
@ 2005-05-19 6:23 ` phil
2005-05-19 6:23 ` lm_sensors Kyösti Mälkki
` (25 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: phil @ 2005-05-19 6:23 UTC
To: lm-sensors
Just to throw my $.02 in, I think Kyosti is right to say the problem
isn't really ideally fixed, however we've made some significant
progress in making it fairly safe for known users with this chip.
As far as adding locking goes, I can't think of a reason why we need
locking to implement the I2C/SMBus protocol. It would be the safest
way to go to implement a completely reliable workaround for this
particular issue, but is it *right*? Unless we are going to need this
locking mechanism as a standard feature, I would opt to not implement
it. Are there any other devices which require consecutive
transactions (not just with it, but all bus transactions) to be tightly
controlled? Do we have reason to believe that there may be more in
the future?
The nice thing with putting drivers in the kernel is that we can allow
everyone access to the bus w/o the risk of someone hogging control of
it. If we add locking, we potentially lose that if a user-space app
abuses the locking feature.
Phil
On Sat, Sep 07, 2002 at 02:18:18PM -0400, Mark D. Studebaker wrote:
> Kyösti Mälkki wrote:
> >
> > On Thu, 5 Sep 2002, Mark D. Studebaker wrote:
> >
> > > Kyosti has described some ways in which the 24RF08 could still
> > > theoretically be corrupted (on non-IBM systems). While not disagreeing
> > > with him, I think we need to draw the line somewhere, and in my
> > > opinion we have a good explanation for Alan Cox that we have both
> > > blacklisted the IBM systems _AND_ fixed the actual problem on non-IBM
> > > systems, if there are any.
> >
> > I think we should not claim the actual problem fixed until adapter
> > lock is held between the two Write Quicks. Unless of course, if you
> > can convince Alan Cox that it hardly ever happens...
> >
> > 1. Module eeprom.o
> >
> > Generates Quick + multiple Read sequences if loaded with chksum!=0.
> >
>
> I verified that 'modprobe eeprom checksum=1' does corrupt the 24RF08.
> We should add a second write quick, but that doesn't solve your locking
> concern.
>
> > Lucky we are, even number of Quicks protects 24rf08 from being
> > corrupted by future transactions, but the lock issue applies here.
> > I do not know if loading another client "simultaneously" is safe.
> >
> > 2. Two Write Quicks are broken apart, since adapter lock is released
> >
> > Only single Write Quick in i2cdetect - but even number of those, and
> > two Write Quicks in sensors-detect but releasing the bus in between.
> > Just running sensors-detect twice may be harmful. First run loads
> > sensor client that polls readings every five seconds or so.
> >
>
> We could double the write quicks in i2cdetect like we did in
> sensors-detect.
> But that doesn't solve your locking concern.
>
> So do you have a proposal for locking?
> Is there any way to lock the adapter from userspace?
> If there is, that's preferable to hacking i2c-core to double/fake write
> quicks, in my opinion.
>
> > > I don't see how we can prevent corruption in a multi-master system.
> >
> > If 0x54-0x57 is known to be any eeprom, Write Byte could replace Write
> > Quick for the address probe.
--
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
phil@netroedge.com -- http://www.netroedge.com/~phil
PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
* lm_sensors
From: Kyösti Mälkki @ 2005-05-19 6:23 UTC
To: lm-sensors
On Thu, 5 Sep 2002, Mark D. Studebaker wrote:
> Kyosti has described some ways in which the 24RF08 could still
> theoretically be corrupted (on non-IBM systems). While not disagreeing
> with him, I think we need to draw the line somewhere, and in my
> opinion we have a good explanation for Alan Cox that we have both
> blacklisted the IBM systems _AND_ fixed the actual problem on non-IBM
> systems, if there are any.
I think we should not claim the actual problem fixed until adapter
lock is held between the two Write Quicks. Unless of course, if you
can convince Alan Cox that it hardly ever happens...
1. Module eeprom.o
Generates Quick + multiple Read sequences if loaded with chksum!=0.
Luckily, an even number of Quicks protects the 24RF08 from being
corrupted by future transactions, but the lock issue applies here.
I do not know if loading another client "simultaneously" is safe.
2. Two Write Quicks are broken apart, since the adapter lock is released
Only single Write Quick in i2cdetect - but even number of those, and
two Write Quicks in sensors-detect but releasing the bus in between.
Just running sensors-detect twice may be harmful. First run loads
sensor client that polls readings every five seconds or so.
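The race in item 2 can be illustrated with a toy model. This is a minimal Python sketch, not lm_sensors or i2c-core code: ToyAdapter, probe_split, probe_atomic and corrupted_after are hypothetical names, and the chip behaviour (each Write Quick in 0x54-0x57 toggles an "armed" state, and a read from any address while armed corrupts) is an assumption drawn from the description in this thread, not from the datasheet.

```python
import threading

EEPROM_RANGE = range(0x54, 0x58)  # 0x54-0x57


class ToyAdapter:
    """Hypothetical stand-in for an SMBus adapter with a 24RF08 on it."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of the adapter lock
        self.armed = False            # True after an odd number of Quicks
        self.corrupted = False

    def _write_quick(self, addr):
        # Assumed model: each Write Quick in the range toggles the state.
        if addr in EEPROM_RANGE:
            self.armed = not self.armed

    def read_byte(self, addr):
        # Assumed model: a read from any address while armed corrupts.
        with self.lock:
            if self.armed:
                self.corrupted = True
            return 0

    def probe_split(self, addr, other_client):
        """Two Write Quicks, releasing the lock in between."""
        with self.lock:
            self._write_quick(addr)
        other_client()  # another client gets the bus here
        with self.lock:
            self._write_quick(addr)

    def probe_atomic(self, addr, other_client):
        """Two Write Quicks inside one critical section."""
        client = threading.Thread(target=other_client)
        with self.lock:
            self._write_quick(addr)
            client.start()  # the client may try now, but must wait
            self._write_quick(addr)
        client.join()


def corrupted_after(probe_name):
    """Run one probe at 0x54 while a sensor client polls another chip."""
    adapter = ToyAdapter()
    poll = lambda: adapter.read_byte(0x2D)  # hypothetical sensor address
    getattr(adapter, probe_name)(0x54, poll)
    return adapter.corrupted
```

In this model corrupted_after("probe_split") reports corruption because the polling read lands between the two Quicks, while corrupted_after("probe_atomic") does not, since the read has to wait until the pair is complete.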
> I don't see how we can prevent corruption in a multi-master system.
If 0x54-0x57 is known to be any eeprom, Write Byte could replace Write
Quick for the address probe.
--
Kyösti Mälkki
kmalkki@cc.hut.fi
* lm_sensors
From: Mark D. Studebaker @ 2005-05-19 6:23 UTC
To: lm-sensors
Thank you Pam for your long email.
I think your "understandings" #1-3 are correct,
as are your "solutions" #1-2.
Solution #1 is only interesting if you care to release
the interface to the hardware sensors. We don't really have
much interest in having people run lm_sensors just to
access the eeprom, for example. So if you would like to
release the information so that thinkpad users can access their
sensors under linux, great. If not, I don't think
there is a lot of demand for this. My opinion anyway.
Solution #2 is implemented now, in a rather coarse way.
We read the DMI information
in the BIOS and look for a "Vendor" string of "IBM"
in the "System Information Block".
This is implemented both in "sensors-detect" and in "i2c-piix4".
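The coarse vendor check could look roughly like the Python sketch below. It is only an illustration: the real check lives in "sensors-detect" and "i2c-piix4", the function names here are hypothetical, and reading /sys/class/dmi/id/sys_vendor is an assumption about how a current Linux system exposes the DMI vendor string, not what the 2002 code did. The exact string match and the choice to allow probing when the vendor cannot be read are assumptions as well.

```python
from pathlib import Path

# Assumption: sysfs file carrying the DMI system vendor on current kernels.
DMI_VENDOR_PATH = Path("/sys/class/dmi/id/sys_vendor")

# Vendors whose systems are refused; only "IBM" comes from the text above.
BLACKLISTED_VENDORS = {"IBM"}


def read_dmi_vendor(path=DMI_VENDOR_PATH):
    """Return the vendor string, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def vendor_is_blacklisted(vendor):
    """True when the vendor string exactly matches a blacklisted name."""
    return vendor is not None and vendor.strip() in BLACKLISTED_VENDORS


def may_probe_smbus(vendor):
    """Refuse to probe on blacklisted systems.

    Assumption: an unreadable vendor (None) is treated as not blacklisted.
    """
    return not vendor_is_blacklisted(vendor)


if __name__ == "__main__":
    vendor = read_dmi_vendor()
    print(vendor, "->", "probe" if may_probe_smbus(vendor) else "skip")
```

A check this coarse refuses every IBM system, including ones without a 24RF08, which is why a finer identifier would be an improvement.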
Solution #2 could obviously be improved if you give us a way
to identify systems that specifically have an Atmel 24RF08 eeprom
on them. It is our suspicion that more recent Thinkpads have
a standard eeprom (Atmel 24C08 or compatible). These are thought
not to be susceptible to corruption. The "MTM" info sounds
great - _if_ you can correlate MTMs to eeprom types!
We have also implemented a third solution.
Solution #3 works if solution #2 fails and is a true "root cause" fix.
The root cause is a quick write "0" in the chip range 0x54-0x57 (the
24RF08) followed by a read from any chip at any address, which corrupts
the 24RF08 due to a bug ("feature"?) in that chip.
This sequence happens in our "sensors-detect" script (which also has the
solution #2 fix).
The script now follows any quick write "0" in the chip range 0x54-0x57
with a second quick write. This resets the 24RF08 state machine and
prevents corruption. This fix is tested and verified.
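The doubled quick write can be sketched against a toy bus model. This is a minimal Python illustration under stated assumptions, not the "sensors-detect" code (which is a Perl script): ToyBus, probe and scan_is_safe are hypothetical names, 0x2D is an arbitrary stand-in for an unrelated sensor chip, and the chip model (a quick write in 0x54-0x57 toggles a state, and a read from any address in that state corrupts) is inferred from the description above rather than from the datasheet.

```python
EEPROM_RANGE = range(0x54, 0x58)  # 0x54-0x57, where a 24RF08 may sit


class ToyBus:
    """Hypothetical bus model; the chip behaviour is assumed, not measured."""

    def __init__(self):
        self.armed = False      # True after an odd number of quick writes
        self.corrupted = False

    def write_quick(self, addr):
        # Assumed: each quick write in the range toggles the chip state.
        if addr in EEPROM_RANGE:
            self.armed = not self.armed

    def read_byte(self, addr):
        # Assumed: a read from any chip while armed corrupts the eeprom.
        if self.armed:
            self.corrupted = True
        return 0


def probe(bus, addr, double_quick=True):
    """Probe one address; pair the quick write in the risky range."""
    bus.write_quick(addr)
    if double_quick and addr in EEPROM_RANGE:
        bus.write_quick(addr)  # second quick write resets the state


def scan_is_safe(double_quick):
    """Scan the bus, reading an unrelated chip after every probe."""
    bus = ToyBus()
    for addr in range(0x03, 0x78):
        probe(bus, addr, double_quick)
        bus.read_byte(0x2D)  # hypothetical sensor chip address
    return not bus.corrupted
```

In this model a scan with single quick writes corrupts the eeprom at the first read after probing 0x54, while the paired version never leaves the chip armed when a read occurs. The sketch assumes nothing else touches the bus between the two quick writes.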
Kyosti has described some ways in which the 24RF08 could still
theoretically be corrupted (on non-IBM systems). While not disagreeing
with him, I think we need to draw the line somewhere, and in my opinion
we have a good explanation for Alan Cox that we have both blacklisted the
IBM systems _AND_ fixed the actual problem on non-IBM systems, if there
are any.
I don't see how we can prevent corruption in a multi-master system.
So I propose leaving both solution #2 and solution #3 in place.
In fact Linus has accepted a patch for kernel 2.5.34 that exports
a variable to us to implement solution #2 in-kernel - this
patch was blessed by Alan Cox and it (together with the
explanation above) paves the way for inclusion
of lm_sensors in 2.5.
For Pam I have the following questions:
- Is our Solution #2 (DMI) a valid way of identifying IBM systems?
- Can you give us a way to identify systems that contain 24RF08's?
- Can you release to us the method for accessing hardware sensors?
- Do you have someone that can verify our solutions #2 and #3 on
a 24RF08-containing system running linux? If you give us a contact
we can send them code and instructions. Of course, worst case,
they would have to be prepared for corruption and be able to fix it.
thanks again for your help.
mds
phil@netroedge.com wrote:
>
> Hey Mark, do we need more specific information on detecting Thinkpads,
> or are we confident that we can work around the issue w/o needing to
> resort to blacklisting?
>
> One issue that concerns me (and Alan Cox) with the blacklisting is
> that we are assuming that the AT24RF08 won't be run into on other
> hardware (IBM Intellistations which are suggested to have chips like
> the 24RF08, or even non-IBM hardware).
>
> If we think we have a possible fix in place, perhaps Pam (and those on
> the thinkpad mailing list) can help confirm the fix? This would be
> preferable to the DMI detection and workaround mess (although you guys
> did some awesome work with that).
>
> Phil
>
> On Thu, Sep 05, 2002 at 01:56:44PM -0400, Pam Huntley wrote:
> >
> > Hi Phil,
> >
> > I have heard from the hardware engineers in Japan. They wanted me to
> > clarify some things with you, particularly the optimal solution you would
> > like, and what is tolerable.
> >
> > First I'd like to make sure I actually understand the problem, since I'm
> > not really a hardware person, and our ThinkPad hardware guys only speak
> > passable English. Below is what I understand, put together from your
> > emails and the hardware team's comments:
> > 1. lm_sensors is a software package for Linux that does health monitoring
> > of hardware, soon to be added to the Linux kernel. It uses a wide variety
> > of sensors, including temperature, battery life, fan speed, voltages,
> > memory detection, etc. The typical PC has a chipset on the motherboard
> > which is usually accessed via the ISA bus or the SMBus, which is what
> > lm_sensors is coded to use.
> > 2. The sensor (the thermal sensor) in ThinkPad is not connected to SMBUS,
> > instead IBM normally uses an embedded controller to monitor thermal sensors
> > (sometimes using multiple sensors). However, the H/W implementation varies
> > depending on model. IBM does not disclose the interface to access to those
> > sensors.
> > 3. lm_sensors uses SMBUS to connect several different devices, and one
> > of them is ATMEL EEPROM, which contains machine serial or other
> > device/system vital information. lm_sensors accesses the EEPROM in a way
> > that causes it to be corrupted. To quote your recent email:
> > "We got some samples of the Atmel AT24RF08 chip, and we
> > were able to reproduce the corruption! In a nut-shell, this
> > particular chip has a broken I2C bus state-machine which can interpret
> > certain sequences of bus communications (including communications with
> > other unrelated chips) as being 'data write' commands which corrupt
> > the eeprom."
> > The BIOS then detects the error condition and posts an error code, and the
> > machine needs repair.
> >
> >
> > As far as solutions you'd like, my understanding is this:
> > 1. Optimal solution: you have the hardware specs, you know what chipsets
> > are involved, and you can access the information without blowing away the
> > eeprom.
> > 2. Minimal solution: you know how to detect IBM hardware, and disable lm
> > sensors on it.
> >
> > The hardware guys are suggesting you detect IBM ThinkPads specifically, and
> > are preparing a document for public release that would tell you how to do
> > this. Knowing how IBM works (legal reviews, etc), this may take a little
> > time, but at least it could allow lm_sensors to still run on the server
> > hardware that isn't broken.
> >
> > It seems to me that for lm_sensors to work flawlessly on all ThinkPads, you
> > would need to know all the different ways that the hardware engineers
> > implement their sensors, and how to access this information safely. Is
> > this correct? As far as I can tell, the ThinkPad hardware engineers are
> > very reluctant to release this information. The reason that was given to
> > me is that whenever they released the specs for their BIOS and related
> > hardware in the past, they got locked down to a particular implementation,
> > and were unable to change things without upsetting the people that were
> > relying on that particular design. However, they have been willing to
> > release some limited information recently, so if you do need this
> > information, we could at least ask.
> >
> > Please let me know your thoughts on all this. I'll tell the hardware guys
> > to proceed with the documentation on how to detect IBM ThinkPads. Whether
> > or not I pursue more information depends on your response.
> >
> > Thanks,
> > Pam
> >
> >
> > ======================
> > Pamela Huntley, IBM PCD Software Development
> > Phone: (919) 543-3598 Email: phuntley@us.ibm.com
> >
> >
> >
> >
> > From: phil@netroedge.com
> > To: Pam Huntley/Raleigh/IBM@IBMUS, sensors@Stimpy.netroedge.com
> > Date: 08/30/2002 07:22 PM
> > Subject: Re: lm sensors
> >
> >
> >
> >
> >
> >
> >
> > On Fri, Aug 30, 2002 at 05:20:50PM -0400, Pam Huntley wrote:
> > >[...]
> > > I gave them your contact information (email). I haven't heard if they'll
> > > contact you directly or not, as they are in Japan, they might just send me
> > > stuff and let me pass it on to you. I should know more next week when they
> > > respond to my email.
> >
> > OK, sounds good.
> >
> > > As far as detecting IBM, I think you are on the right track. My
> > > understanding is that both the vendor flag "IBM" and the MTM (machine type
> > > and model) are located in the BIOS, and that you can access this using
> > > SMAPI calls. We used to use DMI on our older machines, I'm not sure if it
> > > will work on the newer ones. Again, I write mostly GUI software, so I'm a
> > > little fuzzy on things like BIOS, but I can probably get more specifics if
> > > this is something you need to know more about. I believe you can actually
> > > get the MTM and just test the type to make sure it's a ThinkPad, and that
> > > way you won't have to disable it for ALL IBM machines. Hopefully we can
> > > get the specs to you and you won't have to disable it at all.
> >
> > I'm hoping we can identify the chip and work around the problem so we
> > don't have to blacklist anything. That would be ideal.
> >
> > > I'm hoping that we can get you what you need, I'll keep you posted as to
> > > what I know.
> >
> > Thanks!! We really appreciate your help. :')
> >
> >
> > Phil
> >
> > --
> > Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
> > phil@netroedge.com -- http://www.netroedge.com/~phil
> > PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
> >
> >
> >
> >
>
> --
> Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
> phil@netroedge.com -- http://www.netroedge.com/~phil
> PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
* lm_sensors
From: Mark D. Studebaker @ 2005-05-19 6:23 UTC
To: lm-sensors
Kyösti Mälkki wrote:
>
> On Thu, 5 Sep 2002, Mark D. Studebaker wrote:
>
> > Kyosti has described some ways in which the 24RF08 could still
> > theoretically be corrupted (on non-IBM systems). While not disagreeing
> > with him, I think we need to draw the line somewhere, and in my
> > opinion we have a good explanation for Alan Cox that we have both
> > blacklisted the IBM systems _AND_ fixed the actual problem on non-IBM
> > systems, if there are any.
>
> I think we should not claim the actual problem fixed until adapter
> lock is held between the two Write Quicks. Unless of course, if you
> can convince Alan Cox that it hardly ever happens...
>
> 1. Module eeprom.o
>
> Generates Quick + multiple Read sequences if loaded with chksum!=0.
>
I verified that 'modprobe eeprom checksum=1' does corrupt the 24RF08.
We should add a second write quick, but that doesn't solve your locking
concern.
> Lucky we are, even number of Quicks protects 24rf08 from being
> corrupted by future transactions, but the lock issue applies here.
> I do not know if loading another client "simultaneously" is safe.
>
> 2. Two Write Quicks are broken apart, since adapter lock is released
>
> Only single Write Quick in i2cdetect - but even number of those, and
> two Write Quicks in sensors-detect but releasing the bus in between.
> Just running sensors-detect twice may be harmful. First run loads
> sensor client that polls readings every five seconds or so.
>
We could double the write quicks in i2cdetect like we did in
sensors-detect. But that doesn't solve your locking concern.
So do you have a proposal for locking?
Is there any way to lock the adapter from userspace?
If there is, that's preferable to hacking i2c-core to double/fake write
quicks, in my opinion.
> > I don't see how we can prevent corruption in a multi-master system.
>
> If 0x54-0x57 is known to be any eeprom, Write Byte could replace Write
> Quick for the address probe.
* lm_sensors
From: Mark Studebaker @ 2005-05-19 6:23 UTC
To: lm-sensors
We have blacklisted all IBM systems now.
After we get info from you, and there is more testing,
we may remove the blacklisting, except for specific systems, or
remove it altogether.
We want to be very conservative at first, as a conservative
approach is a prerequisite to getting in the kernel.
Pam Huntley wrote:
>
> Hi Mark,
>
> For most of your questions, I've sent a note to our hardware engineers to
> get the correct answers from them.
>
> As far as DMI goes, for most ThinkPads you can tell if it is an IBM system
> this way. Our hardware team is preparing a detailed document that will let
> you tell which IBM ThinkPad it is, which should give you a clue about the
> chipset, and I'll see if we can also get the list of which ThinkPads use
> the 24RF08.
>
> I was a little unclear from your email regarding the blacklisting - do you
> mean that if you have a work around, you won't blacklist the IBM systems?
> Or you will still blacklist them even if you think you have a fix? As I
> said above, there should be a document soon to detect not just the IBM
> brand, but also if it is a ThinkPad, and if so, which one. Hopefully this
> will mean you do not have to blacklist IBM hardware that does not contain
> the 24RF08. Particularly I would like to avoid impacting IBM servers, if
> possible.
>
> As far as testing goes, I'll keep after my manager on it. I've also asked
> the hardware team in Japan if they have any engineers or hardware they can
> spare. Hopefully something will come up.
>
> Thanks for all the good work you guys are doing. Personally I'm a bit of a
> Linux fan, so I'm glad to help out in whatever way I can.
>
> Pam
>
> ======================
> Pamela Huntley, IBM PCD Software Development
> Phone: (919) 543-3598 Email: phuntley@us.ibm.com
>
>
> From: "Mark D. Studebaker" <mds@paradyne.com>
> Sent by: mds@us.ibm.com
> To: sensors@Stimpy.netroedge.com
> cc: Pam Huntley/Raleigh/IBM@IBMUS
> Date: 09/05/2002 10:41 PM
> Subject: Re: lm_sensors
>
> [...]
* lm_sensors
From: Mark D. Studebaker @ 2005-05-19 6:23 UTC
To: lm-sensors
Here's my proposal.
I would like to do the 2.6.5 release as-is this week and then submit it
to Alan, for two reasons:
- what we have in CVS is much safer for Thinkpad and 24RF08 users than
what is in 2.6.4; we should make it available even if it is an
incremental improvement and not a perfect fix;
- it's much easier for tracking if we submit releases rather than CVS
stuff to Alan.
We are all agreed that what we have is not bulletproof and that we
shouldn't claim it is.
Let's plan on addressing whatever concerns Alan may have, together
with incorporating any further info we get from IBM, in 2.6.6.
That release could also incorporate locking or i2c-core tweaks if
that makes sense.
As long as we are working with Alan and trying to get into 2.5 we should
try to have a release interval of about 4 weeks or less so we can be
responsive.
Thoughts?
phil@netroedge.com wrote:
>
> On Mon, Sep 09, 2002 at 11:18:44AM +0300, Kyösti Mälkki wrote:
> >[...]
> >
> > However I do not consider current situation very satisfactory:
> >
> > 1. Blacklisting by DMI data does not catch new models beforehand.
>
> Pam claims all current and future models will not use this chip (at
> least on Thinkpads).
>
> > 2. IBM systems with other than PIIX4 are compromised if admin does not
> > run and/or believe sensors-detect warnings and is unaware of the issue.
>
> Yes, there are still configurations which exist which make it possible
> for this problem to crop up again. I worry particularly about other
> chips which may react negatively to probing.
>
> > 3. If #1 or #2 happens, every existing client driver probing around
> > 0x54-0x57 needs a workaround. Including video4linux(2) drivers from
> > several sources, and any app using the char device.
> >
> > I think patching i2c-core to add extra Write Quick within the critical
> > section is safe and easy way to handle those issues. This leaves only
> > multi-master topologies vulnerable. What do the SMBus specs say, can a
> > laptop share SMBus with a docking station, charger etc ?
>
> I'm wary of changing the code at quite that low of a level. I think
> it's good we tweaked sensors-detect to do what it can to work around
> the issue. It's also nice to have the blacklist in effect as a
> temporary measure until testing is completed.
>
> Past that, I think we should stay out of most kernel code (with
> exception to the blacklisting code). What we most definitely do not
> want to do is change the operation of the code to make it work in an
> unexpected way for developers (e.g. having quick writes get duplicated
> for certain addresses and such).
>
> It's a balance between practicality and technical 'correctness'. I'd
> like to keep things as technically correct as possible. If a chip or
> mobo has a broken design, then it's the problem of the manufacturer.
> In practicality, it is necessary for us to do what we can to make it
> safe for users, but I don't want to move away from having the code
> work as expected simply to work around someone else's broken hardware
> design.
>
> Phil
>
> --
> Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
> phil@netroedge.com -- http://www.netroedge.com/~phil
> PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (5 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors Mark D. Studebaker
@ 2005-05-19 6:23 ` phil
2005-05-19 6:23 ` lm_sensors phil
` (19 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: phil @ 2005-05-19 6:23 UTC (permalink / raw)
To: lm-sensors
Hey Mark, do we need more specific information on detecting Thinkpads,
or are we confident that we can work around the issue w/o needing to
resort to blacklisting?
One issue that concerns me (and Alan Cox) with the blacklisting is
that we are assuming that the AT24RF08 won't be run into on other
hardware (IBM Intellistations which are suggested to have chips like
the 24RF08, or even non-IBM hardware).
If we think we have a possible fix in place, perhaps Pam (and those on
the thinkpad mailing list) can help confirm the fix? This would be
preferable to the DMI detection and workaround mess (although you guys
did some awesome work with that).
Phil
On Thu, Sep 05, 2002 at 01:56:44PM -0400, Pam Huntley wrote:
>
> Hi Phil,
>
> I have heard from the hardware engineers in Japan. They wanted me to
> clarify some things with you, particularly the optimal solution you would
> like, and what is tolerable.
>
> First I'd like to make sure I actually understand the problem, since I'm
> not really a hardware person, and our ThinkPad hardware guys only speak
> passable English. Below is what I understand, put together from your
> emails and the hardware team's comments:
> 1. lm_sensors is a software package for Linux that does health monitoring
> of hardware, soon to be added to the Linux kernel. It uses a wide variety
> of sensors, including temperature, battery life, fan speed, voltages,
> memory detection, etc. The typical PC has a chipset on the motherboard
> which is usually accessed via the ISA bus or the SMBus, which is what
> lm_sensors is coded to use.
> 2. The sensor (the thermal sensor) in ThinkPad is not connected to SMBUS,
> instead IBM normally uses an embedded controller to monitor thermal sensors
> (sometimes using multiple sensors). However, the H/W implementation varies
> depending on model. IBM does not disclose the interface to access to those
> sensors.
> 3. lm_sensors uses SMBUS to connect several different devices, and one
> of them is ATMEL EEPROM, which contains machine serial or other
> device/system vital information. lm_sensors accesses the EEPROM in a way
> that causes it to be corrupt. To quote your recent email:
> "We got some samples of the Atmel AT24RF08 chip, and we
> were able to reproduce the corruption! In a nut-shell, this
> particular chip has a broken I2C bus state-machine which can interpret
> certain sequences of bus communications (including communications with
> other unrelated chips) as being 'data write' commands which corrupt
> the eeprom."
> Then the BIOS detects the error condition and posts the error code, and the
> machine needs repair.
>
>
> As far as solutions you'd like, my understanding is this:
> 1. Optimal solution: you have the hardware specs, you know what chipsets
> are involved, and you can access the information without blowing away the
> eeprom.
> 2. Minimal solution: you know how to detect IBM hardware, and disable lm
> sensors on it.
>
> The hardware guys are suggesting you detect IBM ThinkPads specifically, and
> are preparing a document for public release that would tell you how to do
> this. Knowing how IBM works (legal reviews, etc), this may take a little
> time, but at least it could allow lm_sensors to still run on the server
> hardware that isn't broken.
>
> It seems to me that for lm_sensors to work flawlessly on all ThinkPads, you
> would need to know all the different ways that the hardware engineers
> implement their sensors, and how to access this information safely. Is
> this correct? As far as I can tell, the ThinkPad hardware engineers are
> very reluctant to release this information. The reason that was given to
> me is that whenever they released the specs for their BIOS and related
> hardware in the past, they got locked down to a particular implementation,
> and were unable to change things without upsetting the people that were
> relying on that particular design. However, they have been willing to
> release some limited information recently, so if you do need this
> information, we could at least ask.
>
> Please let me know your thoughts on all this. I'll tell the hardware guys
> to proceed with the documentation on how to detect IBM ThinkPads. Whether
> or not I pursue more information depends on your response.
>
> Thanks,
> Pam
>
>
> ======================
> Pamela Huntley, IBM PCD Software Development
> Phone: (919) 543-3598 Email: phuntley@us.ibm.com
>
>
>
>
> From: phil@netroedge.com
> Date: 08/30/2002 07:22 PM
> To: Pam Huntley/Raleigh/IBM@IBMUS, sensors@Stimpy.netroedge.com
> cc:
> Subject: Re: lm sensors
>
> On Fri, Aug 30, 2002 at 05:20:50PM -0400, Pam Huntley wrote:
> >[...]
> > I gave them your contact information (email). I haven't heard if they'll
> > contact you directly or not, as they are in Japan, they might just send
> > me stuff and let me pass it on to you. I should know more next week
> > when they respond to my email.
>
> OK, sounds good.
>
> > As far as detecting IBM, I think you are on the right track. My
> > understanding is that both the vendor flag "IBM" and the MTM (machine
> > type and model) are located in the BIOS, and that you can access this
> > using SMAPI calls. We used to use DMI on our older machines, I'm not
> > sure if it will work on the newer ones. Again, I write mostly GUI
> > software, so I'm a little fuzzy on things like BIOS, but I can probably
> > get more specifics if this is something you need to know more about. I
> > believe you can actually get the MTM and just test the type to make sure
> > it's a ThinkPad, and that way you won't have to disable it for ALL IBM
> > machines. Hopefully we can get the specs to you and you won't have to
> > disable it at all.
>
> I'm hoping we can identify the chip and work around the problem so we
> don't have to blacklist anything. That would be ideal.
>
> > I'm hoping that we can get you what you need, I'll keep you posted as to
> > what I know.
>
> Thanks!! We really appreciate your help. :')
>
>
> Phil
>
> --
> Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
> phil@netroedge.com -- http://www.netroedge.com/~phil
> PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
>
>
>
>
--
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
phil@netroedge.com -- http://www.netroedge.com/~phil
PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (6 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors phil
@ 2005-05-19 6:23 ` phil
2005-05-19 6:23 ` lm_sensors Kyösti Mälkki
` (18 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: phil @ 2005-05-19 6:23 UTC (permalink / raw)
To: lm-sensors
On Mon, Sep 09, 2002 at 11:18:44AM +0300, Kyösti Mälkki wrote:
>[...]
>
> However I do not consider current situation very satisfactory:
>
> 1. Blacklisting by DMI data does not catch new models beforehand.
Pam claims all current and future models will not use this chip (at
least on Thinkpads).
> 2. IBM systems with other than PIIX4 are compromised if admin does not
> run and/or believe sensors-detect warnings and is unaware of the issue.
Yes, there are still configurations which exist which make it possible
for this problem to crop up again. I worry particularly about other
chips which may react negatively to probing.
> 3. If #1 or #2 happens, every existing client driver probing around
> 0x54-0x57 needs a workaround. Including video4linux(2) drivers from
> several sources, and any app using the char device.
>
> I think patching i2c-core to add extra Write Quick within the critical
> section is safe and easy way to handle those issues. This leaves only
> multi-master topologies vulnerable. What do the SMBus specs say, can a
> laptop share SMBus with a docking station, charger etc ?
I'm wary of changing the code at quite that low of a level. I think
it's good we tweaked sensors-detect to do what it can to work around
the issue. It's also nice to have the blacklist in effect as a
temporary measure until testing is completed.
Past that, I think we should stay out of most kernel code (with the
exception of the blacklisting code). What we most definitely do not
want to do is change the operation of the code to make it work in an
unexpected way for developers (e.g. having quick writes get duplicated
for certain addresses and such).
It's a balance between practicality and technical 'correctness'. I'd
like to keep things as technically correct as possible. If a chip or
mobo has a broken design, then it's the problem of the manufacturer.
In practicality, it is necessary for us to do what we can to make it
safe for users, but I don't want to move away from having the code
work as expected simply to work around someone else's broken hardware
design.
Phil
--
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
phil@netroedge.com -- http://www.netroedge.com/~phil
PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (7 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors phil
@ 2005-05-19 6:23 ` Kyösti Mälkki
2005-05-19 6:23 ` lm_sensors phil
` (17 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Kyösti Mälkki @ 2005-05-19 6:23 UTC (permalink / raw)
To: lm-sensors
On Sun, 8 Sep 2002 phil@netroedge.com wrote:
>
> Just to throw my $.02 in, I think Kyosti is right to say the problem
> isn't really ideally fixed, however we've made some significant
> progress in making it fairly safe for known users with this chip.
Agreed.
> As far as adding locking goes, I can't think of a reason why we need
> locking to implement the I2C/SMBus protocol. It would be the safest
> way to go to implement a completely reliable workaround for this
> particular issue, but is it *right*? Unless we are going to need this
> locking mechanism as a standard feature, I would opt to not implement
> it. Are there any other devices which require consecutive
> transactions (not just with it, but all bus transactions) to be tightly
> controlled? Do we have reason to believe that there may be more in
> the future?
The special Start and Stop sequences guarantee mutual exclusion even on
multi-master buses. E.g. Read Byte/Word sequences do not send Stop
between selecting the register and reading its contents. There should be no need
to promote further bus locking for user-space.
> The nice thing with putting drivers in the kernel is that we can allow
> everyone access to the bus w/o the risk of someone hogging control of
> it. If we add locking, we potentially lose that if a user-space app
> abuses the locking feature.
I agree.
However, I do not consider the current situation very satisfactory:
1. Blacklisting by DMI data does not catch new models beforehand.
2. IBM systems with anything other than PIIX4 are compromised if the admin
does not run and/or believe the sensors-detect warnings and is unaware of
the issue.
3. If #1 or #2 happens, every existing client driver probing around
0x54-0x57 needs a workaround, including video4linux(2) drivers from
several sources, and any app using the char device.
I think patching i2c-core to add an extra Write Quick within the critical
section is a safe and easy way to handle those issues. This leaves only
multi-master topologies vulnerable. What do the SMBus specs say, can a
laptop share SMBus with a docking station, charger etc ?
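
A toy model of the idea above may help: both Write Quicks are issued inside one critical section, so nothing else from the same host can land between them. This is only a sketch in Python, not i2c-core code. `ToyAdapter`, `probe` and `run_demo` are hypothetical names, and sending the extra Write Quick to the same address is an assumption; the thread only gives the 0x54-0x57 range.

```python
import threading

# Address range quoted in this thread for the affected EEPROM.
GUARDED = range(0x54, 0x58)


class ToyAdapter:
    """Hypothetical stand-in for an SMBus adapter; records each transaction."""

    def __init__(self):
        self.lock = threading.Lock()   # plays the role of the adapter lock
        self.log = []                  # (thread name, address) per Write Quick

    def _write_quick(self, addr):
        # Caller must hold self.lock.
        self.log.append((threading.current_thread().name, addr))

    def probe(self, addr):
        # Both Write Quicks happen inside one critical section, so no other
        # transaction from this host can slip in between them.
        with self.lock:
            self._write_quick(addr)
            if addr in GUARDED:
                # Assumption: the extra Write Quick goes to the same address.
                self._write_quick(addr)


def run_demo(threads=4, rounds=200):
    """Probe 0x54 from several threads and return the transaction log."""
    adapter = ToyAdapter()

    def worker():
        for _ in range(rounds):
            adapter.probe(0x54)

    pool = [threading.Thread(target=worker, name=f"t{i}") for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return adapter.log
```

In the log returned by `run_demo()`, every pair of consecutive entries comes from the same thread. As noted above, a second bus master would not be covered by a host-side lock like this.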
--
Kyösti Mälkki
kmalkki@cc.hut.fi
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (8 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors Kyösti Mälkki
@ 2005-05-19 6:23 ` phil
2005-05-19 6:23 ` lm_sensors DB Troll
` (16 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: phil @ 2005-05-19 6:23 UTC (permalink / raw)
To: lm-sensors
Sounds good to me.
Phil
On Mon, Sep 09, 2002 at 08:42:20PM -0400, Mark D. Studebaker wrote:
> Here's my proposal.
>
> I would like to do the 2.6.5 release as-is this week and then submit it
> to Alan, for two reasons:
> - what we have in CVS is much safer for Thinkpad and 24RF08 users than
> what is in 2.6.4; we should make it available even if it is an
> incremental improvement and not a perfect fix;
> - it's much easier for tracking if we submit releases rather than CVS
> stuff to Alan.
>
> We are all agreed that what we have is not bulletproof and that we
> shouldn't claim it is.
>
> Let's plan on addressing whatever concerns Alan may have, together
> with incorporating any further info we get from IBM, in 2.6.6.
> That release could also incorporate locking or i2c-core tweaks if
> that makes sense.
>
> As long as we are working with Alan and trying to get into 2.5 we should
> try and have a release interval of about 4 weeks or less so we can be
> responsive.
>
> thoughts?
>
> phil@netroedge.com wrote:
> >
> > On Mon, Sep 09, 2002 at 11:18:44AM +0300, Kyösti Mälkki wrote:
> > >[...]
> > >
> > > However I do not consider current situation very satisfactory:
> > >
> > > 1. Blacklisting by DMI data does not catch new models beforehand.
> >
> > Pam claims all current and future models will not use this chip (at
> > least on Thinkpads).
> >
> > > 2. IBM systems with other than PIIX4 are compromised if admin does not
> > > run and/or believe sensors-detect warnings and is unaware of the issue.
> >
> > Yes, there are still configurations which exist which make it possible
> > for this problem to crop up again. I worry particularly about other
> > chips which may react negatively to probing.
> >
> > > 3. If #1 or #2 happens, every existing client driver probing around
> > > 0x54-0x57 needs a workaround. Including video4linux(2) drivers from
> > > several sources, and any app using the char device.
> > >
> > > I think patching i2c-core to add extra Write Quick within the critical
> > > section is safe and easy way to handle those issues. This leaves only
> > > multi-master topologies vulnerable. What do the SMBus specs say, can a
> > > laptop share SMBus with a docking station, charger etc ?
> >
> > I'm wary of changing the code at quite that low of a level. I think
> > it's good we tweaked sensors-detect to do what it can to work around
> > the issue. It's also nice to have the blacklist in effect as a
> > temporary measure until testing is completed.
> >
> > Past that, I think we should stay out of most kernel code (with
> > exception to the blacklisting code). What we most definitely do not
> > want to do is change the operation of the code to make it work in an
> > unexpected way for developers (e.g. having quick writes get duplicated
> > for certain addresses and such).
> >
> > It's a balance between practicality and technical 'correctness'. I'd
> > like to keep things as technically correct as possible. If a chip or
> > mobo has a broken design, then it's the problem of the manufacturer.
> > In practicality, it is necessary for us to do what we can to make it
> > safe for users, but I don't want to move away from having the code
> > work as expected simply to work around someone else's broken hardware
> > design.
> >
> > Phil
> >
> > --
> > Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
> > phil@netroedge.com -- http://www.netroedge.com/~phil
> > PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
--
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
phil@netroedge.com -- http://www.netroedge.com/~phil
PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (9 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors phil
@ 2005-05-19 6:23 ` DB Troll
2005-05-19 6:23 ` lm_sensors Albert Kaan
` (15 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: DB Troll @ 2005-05-19 6:23 UTC (permalink / raw)
To: lm-sensors
Could someone please tell me how to get this package to start at boot
and also how to show the voltages in gkrellm?
TIA
David
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (10 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors DB Troll
@ 2005-05-19 6:23 ` Albert Kaan
2005-05-19 6:23 ` lm_sensors Jani Partanen
` (14 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Albert Kaan @ 2005-05-19 6:23 UTC (permalink / raw)
To: lm-sensors
Hi,
I'm trying to get lm_sensors version 2.7.0 up and running with a Gigabyte GA-8LD533 board.
This board is equipped with an ITE8712F-A chip connected to the Intel PCH4 (according to the manual); however, it is not found by sensors-detect. Any clues?
This is the output from sensors-detect:
Next adapter: SMBus I801 adapter at 5000 (Non-I2C SMBus adapter)
Do you want to scan it? (YES/no/selectively):
Client found at address 0x08
Client found at address 0x30
Client found at address 0x44
Client found at address 0x50
Probing for `Serial EEPROM'... Success!
(confidence 1, driver `eeprom')
Probing for `DDC monitor'... Failed!
Client found at address 0x69
isadump reports this:
[root@localhost root]# isadump 0x295 0x296
WARNING! Running this program can cause system crashes, data loss and worse!
I will probe address register 0x0295 and data register 0x0296.
You have five seconds to reconsider and press CTRL-C!
0 1 2 3 4 5 6 7 8 9 a b c d e f
00: 13 10 00 00 37 ff 00 37 ff 07 0d 5b 5b 4c ff ff
10: ff ff ff 50 00 00 00 00 00 00 00 00 00 00 00 00
20: 5d 5b d0 c0 af 62 b3 b7 ff 19 c9 12 12 12 12 12
30: ff 00 ff 00 ff 00 ff 00 ff 00 ff 00 ff 00 ff 00
40: 7f 7f 7f 7f 3c 7f ff ff 2d ff ff ff ff ff ff ff
50: ff 1c 7f 7f 3c ff ff ff 90 5c fb 12 55 00 00 00
60: 7f 7f 7f 7f 7f 00 00 00 7f 7f 7f 7f 7f 00 00 00
70: 7f 7f 7f 7f 00 00 00 00 ff ff ff ff ff ff ff ff
80: 13 10 00 00 37 ff 00 37 ff 07 0d 5b 5b 4c ff ff
90: ff ff ff 50 00 00 00 00 00 00 00 00 00 00 00 00
a0: 5d 5b d0 c0 af 62 b3 b7 ff 19 c9 12 12 12 12 12
b0: ff 00 ff 00 ff 00 ff 00 ff 00 ff 00 ff 00 ff 00
c0: 7f 7f 7f 7f 3c 7f ff ff 2d ff ff ff ff ff ff ff
d0: ff 1c 7f 7f 3c ff ff ff 90 5c fb 12 55 00 00 00
e0: 7f 7f 7f 7f 7f 00 00 00 7f 7f 7f 7f 7f 00 00 00
f0: 7f 7f 7f 7f 00 00 00 00 ff ff ff ff ff ff ff ff
[root@localhost root]#
Thanks for your help.
Best regards,
Albert Kaan
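
A dump like the one above is easier to inspect once it is turned into a register map. The sketch below parses isadump-style rows in Python; `parse_isadump` is a hypothetical helper, not part of lm_sensors. The register choice is an assumption from memory of the it87 driver, which (as far as I recall) reads an ID register at 0x58 and expects 0x90 there for ITE chips.

```python
def parse_isadump(text):
    """Turn isadump rows such as '50: ff 1c ...' into {register: value}."""
    regs = {}
    for line in text.splitlines():
        head, sep, rest = line.strip().partition(":")
        if not sep:
            continue  # no 'xx:' row prefix, e.g. the column header
        try:
            base = int(head, 16)
            values = [int(tok, 16) for tok in rest.split()]
        except ValueError:
            continue  # banner, prompt or warning line
        for offset, value in enumerate(values):
            regs[base + offset] = value
    return regs


# Two rows copied from the dump in this message.
SAMPLE = """\
20: 5d 5b d0 c0 af 62 b3 b7 ff 19 c9 12 12 12 12 12
50: ff 1c 7f 7f 3c ff ff ff 90 5c fb 12 55 00 00 00
"""

regs = parse_isadump(SAMPLE)
chip_id = regs[0x58]   # 0x90 in this dump
```

If the assumption about register 0x58 holds, the value 0x90 suggests that an ITE chip does answer at 0x290 on the ISA side, even though the SMBus scan did not report it.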
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (11 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors Albert Kaan
@ 2005-05-19 6:23 ` Jani Partanen
2005-05-19 6:24 ` lm_sensors Jean Delvare
` (13 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jani Partanen @ 2005-05-19 6:23 UTC (permalink / raw)
To: lm-sensors
Hi!
I have a problem with lm_sensors 2.7.0:
i2c-piix4.o: Bus collision! SMBus may be locked until next hard reset. (sorry!)
I get that a lot. Sometimes it gives correct readings, but not always.
And once it put my machine into sleep or standby mode when I ran
sensors... Ask me for whatever further info you need and I will give it.
Bye!
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (12 preceding siblings ...)
2005-05-19 6:23 ` lm_sensors Jani Partanen
@ 2005-05-19 6:24 ` Jean Delvare
2005-05-19 6:24 ` lm_sensors Dimitri Kouznetsov
` (12 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:24 UTC (permalink / raw)
To: lm-sensors
Some months ago, you reported that you were trying to get lm_sensors
2.7.0 to work with your Gigabyte GA-8LD533. It looks like we never
answered, I guess we were busy or something. Please accept our apologies. Did
you finally get it to work? If not, let us know and we'll try to help.
--
Jean Delvare
http://www.ensicaen.ismra.fr/~delvare/
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (13 preceding siblings ...)
2005-05-19 6:24 ` lm_sensors Jean Delvare
@ 2005-05-19 6:24 ` Dimitri Kouznetsov
2005-05-19 6:24 ` lm_sensors Jean Delvare
` (11 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Dimitri Kouznetsov @ 2005-05-19 6:24 UTC (permalink / raw)
To: lm-sensors
Bios ACPI : PTLTD
Thinkpad T22
DMI : 1GET31WW (1.11)
It does not want to go, saying that ibm systems are known to be unstable with
lm_sensors.
No chance for me :)
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (14 preceding siblings ...)
2005-05-19 6:24 ` lm_sensors Dimitri Kouznetsov
@ 2005-05-19 6:24 ` Jean Delvare
2005-05-19 6:24 ` Lm_sensors Axel Thimm
` (10 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:24 UTC (permalink / raw)
To: lm-sensors
> Bios ACPI : PTLTD
> Thinkpad T22
> DMI : 1GET31WW (1.11)
Could you please download and run vpddecode, from the dmidecode package?
http://savannah.nongnu.org/files/?group=dmidecode
This is a new tool for IBM systems identification. I plan to use it to
allow lm_sensors to be used on known-to-be-safe Thinkpads, but it hasn't
been tested heavily yet.
> It does not want to go, saying that ibm systems are known to be
> unstable with lm_sensors.
> No chance for me :)
The T22 is one of these Thinkpads with a problematic chip on it. I
suggest you forget about lm_sensors, unless you like playing with fire.
--
Jean Delvare
http://www.ensicaen.ismra.fr/~delvare/
* Lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (15 preceding siblings ...)
2005-05-19 6:24 ` lm_sensors Jean Delvare
@ 2005-05-19 6:24 ` Axel Thimm
2005-05-19 6:24 ` Lm_sensors Jean Delvare
` (9 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Axel Thimm @ 2005-05-19 6:24 UTC (permalink / raw)
To: lm-sensors
Hi Den,
On Wed, Jan 14, 2004 at 11:56:21AM +0300, Den wrote:
> Hello Axell!
> I've made a patch to add support for the lm90 and asb100 sensors to the
> Fedora kernel. It works fine for me on Gigabyte Ga-7vaxp (lm90 CPU sensor)
> and Asus P4pe (asb100). The patch is based on the standard lm_sensors
> patch. I've tested it on the 2135, 2138, 2140 and 2149 kernels, all OK.
> The patch replaces your lm_sensors patch. I think support for these modern
> sensors is needed by the community.
> Sorry for bad english.
> DenV <den@nekto.com>
> (some rpms for Fedora core http://den.tourinfo.ru/pack)
On Wed, Jan 14, 2004 at 12:06:10PM +0300, Den wrote:
> Hello Axell!
> PS. This patch needs 2 lines in the configs:
> CONFIG_SENSORS_ASB100=m
> CONFIG_SENSORS_LM90=m
>
> DenV <den@nekto.com>
thanks for the feedback and the testing, they are probably best to be
shared with the lm_sensors list (Cced).
These two drivers are built by the lm_sensors project only when doing
the kernel modules build option, as they are considered
experimental. lm_sensors's patching method is more conservative.
In order to satisfy both conservatives and experimentalists, I provide
both a kernel with the lm_sensors patch, as well as lm_sensors modules
built out of the tree (but against the lm_sensors patched tree).
That is, install a matching lm_sensors-kmdl in addition to the
lm_sensors/i2c enabled kernel, e.g.
lm_sensors-kmdl-2.4.22-1.2140.nptl_30.rhfc1.at-2.8.2-0_19.rhfc1.at.i686.rpm
for the 2.4.22-1.2140.nptl_30.rhfc1.at kernel on an i686.
Could you test whether these rpms work for you? Thanks!
--
Axel.Thimm@physik.fu-berlin.de
* Lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (16 preceding siblings ...)
2005-05-19 6:24 ` Lm_sensors Axel Thimm
@ 2005-05-19 6:24 ` Jean Delvare
2005-05-19 6:25 ` lm_sensors Jean Delvare
` (8 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:24 UTC (permalink / raw)
To: lm-sensors
> > Hello Axell!
> > I've made a patch to add support for the lm90 and asb100 sensors to the
> > Fedora kernel. It works fine for me on Gigabyte Ga-7vaxp (lm90 CPU
> > sensor) and Asus P4pe (asb100). The patch is based on the standard
> > lm_sensors patch. I've tested it on the 2135, 2138, 2140 and 2149
> > kernels, all OK. The patch replaces your lm_sensors patch. I think
> > support for these modern sensors is needed by the community.
> > Sorry for bad english.
> > DenV <den@nekto.com>
> > (some rpms for Fedora core http://den.tourinfo.ru/pack)
>
> On Wed, Jan 14, 2004 at 12:06:10PM +0300, Den wrote:
> > Hello Axell!
> > PS. This patch needs 2 lines in the configs:
> > CONFIG_SENSORS_ASB100=m
> > CONFIG_SENSORS_LM90=m
> >
> > DenV <den@nekto.com>
>
> thanks for the feedback and the testing, they are probably best to be
> shared with the lm_sensors list (Cced).
>
> These two drivers are built by the lm_sensors project only when doing
> the kernel modules build option, as they are considered
> experimental. lm_sensors's patching method is more conservative.
This is about to change. It was decided (oh well, *I* decided and nobody
complained so far) that we would now include all drivers with both
methods. There's no reason that I can see to do it differently.
Not all drivers have been included back into the "patch the kernel tree"
method yet, but in the long term they should.
I've been adding five drivers today (lm83, lm90, asb100, max6650 and
w83l785ts). Patches to enable more of them, or all, are welcome. Else we
will probably add them on request only (unless I suddenly get plenty of
free time, but it's rather improbable).
--
Jean Delvare
http://www.ensicaen.ismra.fr/~delvare/
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (17 preceding siblings ...)
2005-05-19 6:24 ` Lm_sensors Jean Delvare
@ 2005-05-19 6:25 ` Jean Delvare
2005-05-19 6:25 ` lm_sensors Jean Delvare
` (7 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:25 UTC (permalink / raw)
To: lm-sensors
Hi Dmitry,
> > Unfortunately, this could be an indication that PWM is not wired on this
> > board. The two Gigabyte boards using the it87 driver I have access to
> > and that have their PWM lines wired do have the option in BIOS.
>
> Does it mean, I can do nothing if PWM lines are not wired?
"Nothing" is a bit exaggerated. This means that you cannot use your
IT8712F chip for fan speed control. There are other solutions to lower
the speed of your fans:
1* Add a hardware fan speed controller between the motherboard fan header
and the fan.
2* Change the fan for a lower speed model.
3* Change the fan for a thermoregulated model.
Obviously these solutions are more expensive and less flexible than what
you could have done with the IT8712F chip, but it now seems like these
are the only options left. You'll have to read some specialized sites
about low-noise computing and do some hardware hacking.
> How can I become sure about the wiring of the PWM lines?
Your experiments seem to prove that they are not wired. One way to
confirm this would be to contact the Gigabyte technical support and ask
them:
http://www.giga-byte.com/TechSupport/ServiceCenter.htm
--
Jean Delvare
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (18 preceding siblings ...)
2005-05-19 6:25 ` lm_sensors Jean Delvare
@ 2005-05-19 6:25 ` Jean Delvare
2005-05-19 6:25 ` lm_sensors Jean Delvare
` (6 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:25 UTC (permalink / raw)
To: lm-sensors
Hi Dmitri,
> I have a problem with setting up fan control. After finding your
> suggestions at http://lkml.org/lkml/2005/1/19/244 I hope you can help
> me.
>
> I have:
> Handle 0x0002
> DMI type 2, 8 bytes.
> Base Board Information
> Manufacturer: Gigabyte Technology Co., Ltd.
> Product Name: 8I845GV
> Version: 1.x
>
> lm_sensors detected:
>
> # I2C adapter drivers
> modprobe i2c-i810
> modprobe i2c-i801
> modprobe i2c-isa
> # I2C chip drivers
> modprobe eeprom
> modprobe it87
>
> But there are no pwm registers in /sys/bus/i2c/devices/*/
Which kernel version are you using?
Check your logs when loading the it87 driver. Do you see a complaint
about bogus fan polarity, or any other warning?
--
Jean Delvare
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (19 preceding siblings ...)
2005-05-19 6:25 ` lm_sensors Jean Delvare
@ 2005-05-19 6:25 ` Jean Delvare
2005-05-19 6:25 ` lm_sensors Jean Delvare
` (5 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:25 UTC (permalink / raw)
To: lm-sensors
Hi Dmitry,
> > Which kernel version are you using?
>
> 2.6.11.5
Unfortunately, this kernel version does not have support for the fan polarity
fix yet. You will have to upgrade to 2.6.12-rc2 or later (I would suggest
-rc4, obviously).
> it87 3-0290: detected broken BIOS defaults, disabling pwm interface
So this means that your BIOS did not program the PWM settings properly
(or at least we think it did not). You should first search for a BIOS
upgrade for your motherboard. If none is available, you should contact
Gigabyte and tell them about your problem.
Then, if all of the above failed, you may upgrade to 2.6.12-rc2 or later
and try:
modprobe it87 fix_pwm_polarity=1
Note that this is still experimental, so be very careful that your fans
don't stop.
You should now see the pwm control files in /sys, and if the workaround
works for you, you should be able to control the speed of your fans.
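A quick way to check whether the pwm files showed up is sketched below.
The helper name is hypothetical, and the default directory is simply
the sysfs path mentioned earlier in this thread; adjust it if your
devices are exposed elsewhere.

```sh
# Hypothetical helper: list the pwm control files one level below a
# sysfs-style directory (default: /sys/bus/i2c/devices).
list_pwm_files() {
    base=${1:-/sys/bus/i2c/devices}
    # Matches pwm1, pwm2, ... as well as pwm1_enable and friends.
    for f in "$base"/*/pwm[0-9]*; do
        # If nothing matches, the pattern is left as-is; skip it.
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}

# Typical use:
#   list_pwm_files
```

No output means the driver did not create any pwm files, in which case
the kernel log is the next place to look.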
Hope that helps,
--
Jean Delvare
^ permalink raw reply [flat|nested] 31+ messages in thread
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (20 preceding siblings ...)
2005-05-19 6:25 ` lm_sensors Jean Delvare
@ 2005-05-19 6:25 ` Jean Delvare
2005-05-19 6:25 ` lm_sensors Jean Delvare
` (4 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:25 UTC (permalink / raw)
To: lm-sensors
Hi Dmitry,
> I've upgraded BIOS to the latest version (4), downloaded rc4, copied
> it87.c into 2.6.11.5 branch, reinstalled kernel. Now pwm registers are
> seen.
OK. Did you have to use the "fix_pwm_polarity" parameter, or did the
BIOS upgrade fix it for you?
> But during pwmconfig fan doesn't stop:
I would recommend that you first experiment manually with the pwm files.
pwmconfig works fine in the general case, but will hide most problems
from you.
> It looks I'm near the end but I need your help.
First, check your BIOS for an automatic fan speed adjustment option. Is
there one? If there is one, disable it.
Then start experimenting with the PWM values manually, once you have
checked that pwmN_enable is set to 1 (manual control enabled).
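A manual experiment could look roughly like the sketch below. The
helper name and its arguments are made up for illustration, and the
0-255 range is an assumption based on the usual sysfs pwm scale (0 is
off, 255 is full speed). Watch the fan and the temperatures while you
try this.

```sh
# Hypothetical helper: put one PWM output under manual control and set
# its duty cycle.
#   $1 = device directory holding the pwm files (assumed to be one of
#        the directories under /sys/bus/i2c/devices/)
#   $2 = PWM output number (the N in pwmN)
#   $3 = duty cycle, 0 (off) to 255 (full speed)
set_pwm() {
    dev=$1; n=$2; value=$3
    # Reject anything that is not a plain non-negative integer.
    case $value in
        ''|*[!0-9]*)
            echo "value must be an integer from 0 to 255" >&2
            return 1 ;;
    esac
    if ! [ "$value" -le 255 ] 2>/dev/null; then
        echo "value must be an integer from 0 to 255" >&2
        return 1
    fi
    if [ ! -e "$dev/pwm$n" ]; then
        echo "no pwm$n file in $dev" >&2
        return 1
    fi
    echo 1 > "$dev/pwm${n}_enable" || return 1   # 1 = manual control
    echo "$value" > "$dev/pwm$n"
}

# Typical use (as root), stepping down gradually rather than going
# straight to 0:
#   set_pwm /sys/bus/i2c/devices/<your-device> 1 200
```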
One common pitfall with PWM is that motherboards may not be properly
wired for fan speed control. Are you certain that your motherboard
supports it?
Another possible problem is that your fan has internal thermal
regulation. This would explain why the speed seems to increase over
time: the temperature is likely to increase over time as well. In this
case there's not much you can do with lm_sensors. If the CPU fan is
very loud, either replace it with a quieter model (possibly not
thermoregulated), or add a 12 cm fan in your computer case. Large fans
are quieter, and the additional air flow might lower the temperature
enough to let the CPU fan spin down a bit, for a lower total noise.
Yet another possible problem is that the PWM frequency may not suit your
fan. We have had several reports of this lately. You may try other
frequencies but unfortunately there is no sysfs interface for this yet,
so you'd have to get the datasheet and set the register values manually
(using the isaset tool). Only try this if everything else failed and you
are certain that your motherboard supports fan speed control.
> Additionally I've discovered that CPU temp is negative.
> (...)
> M/B Temp: +25°C (low = +15°C, high = +40°C) sensor = thermistor
> CPU Temp: -55°C (low = +15°C, high = +45°C) sensor = thermistor
> Temp3: +49°C (low = +15°C, high = +45°C) sensor = diode
I don't think it is. More likely, temp2 is not wired and temp3 is the
CPU temperature. It's only a matter of labelling them properly (edit
/etc/sensors.conf).
You can try changing temp2's sensor type from thermistor to diode, just
in case, but chances are that this input simply has nothing connected to
it. Feel free to add an "ignore" statement for it if this is the case.
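As a rough illustration, the sensor type change could look like the
fragment below in /etc/sensors.conf. The chip pattern and the type
values (2 = thermistor, 3 = thermal diode) are assumptions based on the
usual lm_sensors example configuration for it87, so check them against
the sensors.conf shipped with your version.

```
chip "it87-*"
    # Assumed type values: 2 = thermistor, 3 = thermal diode.
    set sensor2 3
```

Run "sensors -s" as root afterwards so that the "set" statement is
actually applied.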
--
Jean Delvare
^ permalink raw reply [flat|nested] 31+ messages in thread
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (21 preceding siblings ...)
2005-05-19 6:25 ` lm_sensors Jean Delvare
@ 2005-05-19 6:25 ` Jean Delvare
2005-05-19 6:25 ` lm_sensors Jean Delvare
` (3 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:25 UTC (permalink / raw)
To: lm-sensors
Hi Dmitry,
> > First, check your BIOS for an automatic fan speed adjustment option. Is
> > there one? If there is one, disable it.
>
> I've found only 'CPU Warning Temperature', 'CPU FAN fail warning',
> and 'System FAN fail warning' all disabled.
Unfortunately, this could be an indication that PWM is not wired on this
board. The two Gigabyte boards using the it87 driver that I have access
to, and that have their PWM lines wired, do have the option in BIOS.
> It is like the same motherboard with Windoze is silent, though I'm not
> sure.
Please make sure ;) If the motherboard is actually silent under Windows,
please try to find out what the difference can be. Maybe you are using a
specific tool under Windows? If your CPU supports some form of power
saving or throttling, maybe it is enabled under Windows but not under
Linux (which could result in a faster fan speed if your fan is
thermoregulated).
> Where to find the datasheet?
On ITE's web site:
http://www.ite.com.tw/product_info/file/pc/IT8712F_V0.8.2.pdf
> There is a second definition in the file about it87. Maybe I'd
> better to use it?
I can't tell, all motherboards are different. For the temperatures, you
would most likely simply do:
ignore temp2
label temp3 "CPU Temp"
--
Jean Delvare
^ permalink raw reply [flat|nested] 31+ messages in thread
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (22 preceding siblings ...)
2005-05-19 6:25 ` lm_sensors Jean Delvare
@ 2005-05-19 6:25 ` Jean Delvare
2005-05-19 6:25 ` lm_sensors Shue David R Contr AFRL/IFTC
` (2 subsequent siblings)
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-19 6:25 UTC (permalink / raw)
To: lm-sensors
Hi Dmitry,
> My kernel is win4lin enabled (patched). So is it possible to upgrade
> to 2.6.12-rc* only 'drivers/i2c' branch of the kernel or I'll have
> problems?
I just checked: you can actually copy just the it87.c file from
2.6.12-rc4 to your current kernel tree, and that should work just fine.
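The copy itself could be done along these lines. This is only a sketch:
the helper name is made up, and the default location of it87.c inside
the tree (drivers/i2c/chips/) is an assumption, so check where the file
lives in your own trees first.

```sh
# Hypothetical helper: copy it87.c from a newer kernel tree into the
# tree you actually build, keeping a backup of the replaced file.
#   $1 = newer source tree (for example an unpacked 2.6.12-rc4)
#   $2 = current source tree
#   $3 = path of the driver inside the tree (the default is an
#        assumption; verify it for your kernel version)
copy_it87() {
    new=$1; cur=$2; rel=${3:-drivers/i2c/chips/it87.c}
    if [ ! -f "$new/$rel" ]; then
        echo "missing $new/$rel" >&2
        return 1
    fi
    if [ ! -f "$cur/$rel" ]; then
        echo "missing $cur/$rel" >&2
        return 1
    fi
    cp "$cur/$rel" "$cur/$rel.orig" || return 1   # keep the old driver
    cp "$new/$rel" "$cur/$rel"
}

# Typical use:
#   copy_it87 /usr/src/linux-2.6.12-rc4 /usr/src/linux
```

After that, rebuild and reinstall the modules the way you normally do
for your kernel.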
--
Jean Delvare
^ permalink raw reply [flat|nested] 31+ messages in thread
* lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (23 preceding siblings ...)
2005-05-19 6:25 ` lm_sensors Jean Delvare
@ 2005-05-19 6:25 ` Shue David R Contr AFRL/IFTC
2005-05-21 19:26 ` [lm-sensors] lm_sensors Jean Delvare
2005-05-23 15:14 ` [lm-sensors] lm_sensors Jean Delvare
26 siblings, 0 replies; 31+ messages in thread
From: Shue David R Contr AFRL/IFTC @ 2005-05-19 6:25 UTC (permalink / raw)
To: lm-sensors
I have a question please...
I run lm_sensor on my machines and some do not seem to be changing ever,
even with a significant change in room temperature. If I perform a reboot,
the temperature does change and to what I believe is a more accurate
(believable) temperature. I am beginning to think my temperature sensor
readings are not updating. Is there a way to update, restart, stop/start
the sensors software information providing the data?
Thanks in advance.
Dave
^ permalink raw reply [flat|nested] 31+ messages in thread
* [lm-sensors] Re: lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (24 preceding siblings ...)
2005-05-19 6:25 ` lm_sensors Shue David R Contr AFRL/IFTC
@ 2005-05-21 19:26 ` Jean Delvare
2005-05-23 15:14 ` [lm-sensors] lm_sensors Jean Delvare
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-21 19:26 UTC (permalink / raw)
To: lm-sensors
Hi Dave,
Sorry for the delay...
> I have a question please...
>
> I run lm_sensor on my machines and some do not seem to be changing
> ever, even with a significant change in room temperature. If I
> perform a reboot, the temperature does change and to what I believe is
> a more accurate (believable) temperature. I am beginning to think my
> temperature sensor readings are not updating. Is there a way to
> update, restart, stop/start the sensors software information providing
> the data?
This problem isn't unheard of, but I don't think we ever found what
the cause was. Nevertheless, can you tell us:
* the version of the Linux kernel you use,
* the version of lm_sensors you use,
* which hardware and driver you are having the problem with?
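Most of this can be collected in one go, for example with a small
helper like the one below. The function name is made up for
illustration; it only relies on "uname -r" and on the "sensors" program
(with its -v option for the version) being available.

```sh
# Hypothetical helper: print the details asked for above.
sensors_report() {
    echo "kernel: $(uname -r)"
    if command -v sensors >/dev/null 2>&1; then
        echo "lm_sensors: $(sensors -v 2>&1)"
        echo "--- sensors output ---"
        # The first line of this output names the monitoring chip.
        sensors 2>&1 || echo "(sensors returned an error)"
    else
        echo "lm_sensors: sensors program not found in PATH"
    fi
    return 0
}

# Typical use:
#   sensors_report > report.txt
```

The motherboard manufacturer and model still have to be added by hand.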
Thanks,
--
Jean Delvare
^ permalink raw reply [flat|nested] 31+ messages in thread
* [lm-sensors] RE: lm_sensors
2005-05-19 6:23 lm_sensors Pam Huntley
` (25 preceding siblings ...)
2005-05-21 19:26 ` [lm-sensors] lm_sensors Jean Delvare
@ 2005-05-23 15:14 ` Jean Delvare
26 siblings, 0 replies; 31+ messages in thread
From: Jean Delvare @ 2005-05-23 15:14 UTC (permalink / raw)
To: lm-sensors
Hi Dave,
> Kernel 2.4.20
>
> lm_sensor-2.6.5
Iik. These are old (over 2.5 years old!). If there's a bug in there,
chances are good that it has been fixed since. You should consider
trying a more recent version of i2c & lm_sensors.
> Was not exactly sure what you meant by what driver/hardware I am having
> problems with?
Motherboard manufacturer and model, and hardware monitoring chip name.
See the first line of the "sensors" output for the latter. Actually
the full output of "sensors" would be even more helpful. But, as I
said right above, your best chance is to try a more recent version of
our code.
--
Jean Delvare
^ permalink raw reply [flat|nested] 31+ messages in thread