* [lm-sensors] i2c errors while updating sensors
@ 2011-12-23 14:44 Frans Meulenbroeks
2011-12-23 16:23 ` Guenter Roeck
2012-01-02 11:00 ` Frans Meulenbroeks
0 siblings, 2 replies; 3+ messages in thread
From: Frans Meulenbroeks @ 2011-12-23 14:44 UTC (permalink / raw)
To: lm-sensors
[-- Attachment #1.1: Type: text/plain, Size: 2258 bytes --]
I was peeking at lm75/lm80 i2c error message handing.
The problem I am facing is that I am using lm75 for temperature monitoring
and lm80 for temperature and fan tacho monitoring.
This is an embedded (powerpc) system using a 2.6.38 kernel, and accessing
the sensors through the sysfs interface.
The customer wants to be able to detect if someone accidently disconnects
the fan (or the monitoring chip dies).
I know i2c is not hot pluggable, but would have expected to be able to
detect this error condition
However, from peeking at the sources, it seems I am not.
lm75.c says (function lm75_update_device)
for (i = 0; i < ARRAY_SIZE(data->temp); i++) {
int status;
status = lm75_read_value(client, LM75_REG_TEMP[i]);
if (status < 0)
dev_dbg(&client->dev, "reg %d, err %d\n",
LM75_REG_TEMP[i], status);
else
data->temp[i] = status;
}
data->last_updated = jiffies;
data->valid = 1;
So there is an error message if dbg is on, but a user will still get the
old data as temp[i] is not updated.
Wouldn't it be nice to e.g. add a (bool) attribute data-valid or or so and
drop that to 0 on a read fail (or maybe 3 subsequent read fails)?
Or is there a better way do detect such a problem in user space (e.g. by
putting an "error" value in the var)
I've also peeked at lm80.c
Here the situation seems to be a little bit worse.
E.g. I see:
static int lm80_read_value(struct i2c_client *client, u8 reg)
{
return i2c_smbus_read_byte_data(client, reg);
}
...
data->temp =
(lm80_read_value(client, LM80_REG_TEMP) << 8) |
(lm80_read_value(client, LM80_REG_RES) & 0xf0);
but i2c_smbus_read_byte_data may return a negative value upon error.
To me it seems this is not going to return a sensible value in case the
read fails (and also not e.g. something like -1)
Is this desired? Should this be fixed? Or is there a better way to do so?
If needed, I can have a stab at the code to fix it, but in that case I
would appreciate some guidance in the direction of the preferred solution
(so my work is not rejected because it was felt a faulty solution).
Best regards & seasons greetings,
Frans
[-- Attachment #1.2: Type: text/html, Size: 2504 bytes --]
[-- Attachment #2: Type: text/plain, Size: 153 bytes --]
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [lm-sensors] i2c errors while updating sensors
2011-12-23 14:44 [lm-sensors] i2c errors while updating sensors Frans Meulenbroeks
@ 2011-12-23 16:23 ` Guenter Roeck
2012-01-02 11:00 ` Frans Meulenbroeks
1 sibling, 0 replies; 3+ messages in thread
From: Guenter Roeck @ 2011-12-23 16:23 UTC (permalink / raw)
To: lm-sensors
Hi Frans,
On Fri, Dec 23, 2011 at 09:44:35AM -0500, Frans Meulenbroeks wrote:
> I was peeking at lm75/lm80 i2c error message handing.
>
> The problem I am facing is that I am using lm75 for temperature monitoring and
> lm80 for temperature and fan tacho monitoring.
> This is an embedded (powerpc) system using a 2.6.38 kernel, and accessing the
> sensors through the sysfs interface.
>
> The customer wants to be able to detect if someone accidently disconnects the
> fan (or the monitoring chip dies).
> I know i2c is not hot pluggable, but would have expected to be able to detect
> this error condition
>
> However, from peeking at the sources, it seems I am not.
>
>
> lm75.c says (function lm75_update_device)
> for (i = 0; i < ARRAY_SIZE(data->temp); i++) {
> int status;
>
> status = lm75_read_value(client, LM75_REG_TEMP[i]);
> if (status < 0)
> dev_dbg(&client->dev, "reg %d, err %d\n",
> LM75_REG_TEMP[i], status);
> else
> data->temp[i] = status;
> }
> data->last_updated = jiffies;
> data->valid = 1;
> So there is an error message if dbg is on, but a user will still get the old
> data as temp[i] is not updated.
>
> Wouldn't it be nice to e.g. add a (bool) attribute data-valid or or so and drop
> that to 0 on a read fail (or maybe 3 subsequent read fails)?
data->valid typically already exists.
> Or is there a better way do detect such a problem in user space (e.g. by
> putting an "error" value in the var)
>
> I've also peeked at lm80.c
> Here the situation seems to be a little bit worse.
> E.g. I see:
>
> static int lm80_read_value(struct i2c_client *client, u8 reg)
> {
> return i2c_smbus_read_byte_data(client, reg);
> }
>
> ...
>
> data->temp > (lm80_read_value(client, LM80_REG_TEMP) << 8) |
> (lm80_read_value(client, LM80_REG_RES) & 0xf0);
>
> but i2c_smbus_read_byte_data may return a negative value upon error.
> To me it seems this is not going to return a sensible value in case the read
> fails (and also not e.g. something like -1)
>
> Is this desired? Should this be fixed? Or is there a better way to do so?
No, yes, and yes.
Note that returning -1 is not a good idea; one should pass on the error code.
> If needed, I can have a stab at the code to fix it, but in that case I would
> appreciate some guidance in the direction of the preferred solution (so my work
> is not rejected because it was felt a faulty solution).
>
There are multiple options. The easiest would be to set data->valid to 0 and return
the error from the _update_device() function. So you would have
if (status < 0) {
data->valid = 0;
return status;
}
in the update_device() function, and
struct lm75_data *data = lm75_update_device(dev);
if (IS_ERR(data))
return PTR_ERR(data);
...
in the _get() function.
Downside is that each read will return an error, even if the read failed on an attribute
which is not currently read. ltc4261.c is an example for this kind of handling.
Another approach would be to store the actual error in the temp[] array, and make it an int
instead of u16/u8.
int temp[3];
...
data->temp[i] = lm75_read_value(client, LM75_REG_TEMP[i]);
...
int temp = LM75_TEMP_FROM_REG(data->temp[attr->index]);
if (temp < 0)
return temp;
return sprintf(buf, "%d\n", LM75_TEMP_FROM_REG(temp));
More effort and more code, but more accurately reports errors. With this approach,
you would have to make sure that u8/u16 <-> int conversions don't introduce errors;
the arithmetic in some drivers depends on the type of the storage variable.
Example for this kind of solution is max16065.c.
Which approach to use really depends on the chip and on the amount of effort one wants
to make. I typically select the first approach for simple chips, and use the second
approach if there is a possibility that the chip returns an error on one read but not
on others. The latter is mostly a concern for microcontroller based i2c devices.
For lm75, the first method should be ok. For lm80, you could use both, depending how much
time you want to spend on it. If there is a chance that one read returns an error while
other reads don't, the second approach would be much better.
If you want to start playing with the lm80 driver, it would be nice if you could also
fix the checkpatch errors and warnings. That should be a separate patch, though.
Thanks,
Guenter
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [lm-sensors] i2c errors while updating sensors
2011-12-23 14:44 [lm-sensors] i2c errors while updating sensors Frans Meulenbroeks
2011-12-23 16:23 ` Guenter Roeck
@ 2012-01-02 11:00 ` Frans Meulenbroeks
1 sibling, 0 replies; 3+ messages in thread
From: Frans Meulenbroeks @ 2012-01-02 11:00 UTC (permalink / raw)
To: lm-sensors
[-- Attachment #1.1: Type: text/plain, Size: 6112 bytes --]
Guenter,
Thanks for your reply!
I didn't get to implementing this before the Xmas holiday, but have picked
it up today.
(btw. All the best for 2012 for all readers).
See below for some remarks.
2011/12/23 Guenter Roeck <guenter.roeck@ericsson.com>
> Hi Frans,
>
> On Fri, Dec 23, 2011 at 09:44:35AM -0500, Frans Meulenbroeks wrote:
> > I was peeking at lm75/lm80 i2c error message handing.
> >
> > The problem I am facing is that I am using lm75 for temperature
> monitoring and
> > lm80 for temperature and fan tacho monitoring.
> > This is an embedded (powerpc) system using a 2.6.38 kernel, and
> accessing the
> > sensors through the sysfs interface.
> >
> > The customer wants to be able to detect if someone accidently
> disconnects the
> > fan (or the monitoring chip dies).
> > I know i2c is not hot pluggable, but would have expected to be able to
> detect
> > this error condition
> >
> > However, from peeking at the sources, it seems I am not.
> >
> >
> > lm75.c says (function lm75_update_device)
> > for (i = 0; i < ARRAY_SIZE(data->temp); i++) {
> > int status;
> >
> > status = lm75_read_value(client, LM75_REG_TEMP[i]);
> > if (status < 0)
> > dev_dbg(&client->dev, "reg %d, err %d\n",
> > LM75_REG_TEMP[i], status);
> > else
> > data->temp[i] = status;
> > }
> > data->last_updated = jiffies;
> > data->valid = 1;
> > So there is an error message if dbg is on, but a user will still get the
> old
> > data as temp[i] is not updated.
> >
> > Wouldn't it be nice to e.g. add a (bool) attribute data-valid or or so
> and drop
> > that to 0 on a read fail (or maybe 3 subsequent read fails)?
>
> data->valid typically already exists.
>
> > Or is there a better way do detect such a problem in user space (e.g. by
> > putting an "error" value in the var)
> >
> > I've also peeked at lm80.c
> > Here the situation seems to be a little bit worse.
> > E.g. I see:
> >
> > static int lm80_read_value(struct i2c_client *client, u8 reg)
> > {
> > return i2c_smbus_read_byte_data(client, reg);
> > }
> >
> > ...
> >
> > data->temp =
> > (lm80_read_value(client, LM80_REG_TEMP) << 8) |
> > (lm80_read_value(client, LM80_REG_RES) & 0xf0);
> >
> > but i2c_smbus_read_byte_data may return a negative value upon error.
> > To me it seems this is not going to return a sensible value in case the
> read
> > fails (and also not e.g. something like -1)
> >
> > Is this desired? Should this be fixed? Or is there a better way to do so?
>
> No, yes, and yes.
>
Ok, I'll bite and give it a stab.
>
> Note that returning -1 is not a good idea; one should pass on the error
> code.
>
> > If needed, I can have a stab at the code to fix it, but in that case I
> would
> > appreciate some guidance in the direction of the preferred solution (so
> my work
> > is not rejected because it was felt a faulty solution).
> >
> There are multiple options. The easiest would be to set data->valid to 0
> and return
> the error from the _update_device() function. So you would have
>
> if (status < 0) {
> data->valid = 0;
> return status;
> }
>
> in the update_device() function, and
>
> struct lm75_data *data = lm75_update_device(dev);
>
> if (IS_ERR(data))
> return PTR_ERR(data);
>
> ...
>
> in the _get() function.
>
> Downside is that each read will return an error, even if the read failed
> on an attribute
> which is not currently read. ltc4261.c is an example for this kind of
> handling.
>
Thanks. I've implemented this in the same way as ltc4261.c did.
I will supply a patch in a short while.
Not too sure what kernel git I should use to base this patch on.
I will be using git://
git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git as that is
listed as the hwmon staging tree. Hope that is ok.
I also noticed that ltc4261 is not setting data->valid to 0 in case of an
error. This seems a good addition so I'll add a patch for that.
>
> Another approach would be to store the actual error in the temp[] array,
> and make it an int
> instead of u16/u8.
>
> int temp[3];
>
> ...
>
> data->temp[i] = lm75_read_value(client, LM75_REG_TEMP[i]);
>
> ...
>
> int temp = LM75_TEMP_FROM_REG(data->temp[attr->index]);
>
> if (temp < 0)
> return temp;
> return sprintf(buf, "%d\n", LM75_TEMP_FROM_REG(temp));
>
> More effort and more code, but more accurately reports errors. With this
> approach,
> you would have to make sure that u8/u16 <-> int conversions don't
> introduce errors;
> the arithmetic in some drivers depends on the type of the storage variable.
> Example for this kind of solution is max16065.c.
>
> Which approach to use really depends on the chip and on the amount of
> effort one wants
> to make. I typically select the first approach for simple chips, and use
> the second
> approach if there is a possibility that the chip returns an error on one
> read but not
> on others. The latter is mostly a concern for microcontroller based i2c
> devices.
>
> For lm75, the first method should be ok. For lm80, you could use both,
> depending how much
> time you want to spend on it. If there is a chance that one read returns
> an error while
> other reads don't, the second approach would be much better.
>
As lm80 is not a micro based i2c dev, I would suspect that a fail apply to
all register reads.
And if a register read fails, even if it is not the register you are
interested in at that moment, it is probably interesting to know as there
is definitely a problem with the device/device communication
>
> If you want to start playing with the lm80 driver, it would be nice if you
> could also
> fix the checkpatch errors and warnings. That should be a separate patch,
> though.
>
Will peek into that too. Will be after the lm80 error patch (as I already
made that one it is more efficient to do the checkfile later)
>
> Thanks,
> Guenter
>
> Best regards, Frans
[-- Attachment #1.2: Type: text/html, Size: 7857 bytes --]
[-- Attachment #2: Type: text/plain, Size: 153 bytes --]
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2012-01-02 11:00 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-23 14:44 [lm-sensors] i2c errors while updating sensors Frans Meulenbroeks
2011-12-23 16:23 ` Guenter Roeck
2012-01-02 11:00 ` Frans Meulenbroeks
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.