* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
@ 2010-11-18 16:50 Richard Retanubun
2010-11-18 17:32 ` Guenter Roeck
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Richard Retanubun @ 2010-11-18 16:50 UTC (permalink / raw)
To: lm-sensors
On 11/18/10 11:22, Ira W. Snyder wrote:
> On Thu, Nov 18, 2010 at 10:16:27AM -0500, Richard Retanubun wrote:
>> Some bits in the fault register can be useful to differentiate
>> between a planned reset (reboot) or an unplanned reset (pwr-loss).
>> The EN bit can be used for this detection when a board's planned
>> reboot action toggles the EN bit and cuts the regulated voltage
>> (but keeping the hotswap device alive), meanwhile an unplanned
>> reset (pwr-loss) will not have the EN bit set because even the
>> hotswap device got powered off.
>>
>> So, before clearing the fault register at boot, dump the contents
>> of the fault register so that other tools can use it as a forensic
>> marker to differentiate events that preceeds this boot.
>> ---
>> Hi Ira and Jean,
>>
>> Not sure if there is a mailing list for hwmon, I did not see it,
>> the MAINTAINERS file did not help, but git history shows that you two
>> are the people who are working on this driver in the patch.
>>
Hi Ira,
Thanks for the speedy reply.
>
> The mailing list you wanted was lm-sensors@lm-sensors.org.
Thanks, was not sure, cc-ed now.
>
> As for the patch, I think it would be more likely to be accepted with a few changes:
> - only print the register if certain bits are set
> - don't print the hex value, print something meaningful
Sure, but which bits? almost all of them means something to someone depending on how the chip is used.
This is why I opted to a more "interpretation-neutral" hex-value.
I also chose "FReg" because some tools may catch the word "Fault" and flag this dump as something scary :)
I can do basic bit to string mapping using the register names,
something like (GPIO3, GPIO2, FET, EN, PwrBad, OverCurrent, UnderVoltage, OverVoltage)
in the case where all the bits are set, is this what you have in mind?
Also, another feedback I got was to not only print this at device_probe, but also
save the initial value and display it as a device attribute exposed via sysfs,
do you have any comments on this approach?
Btw, in your commit, you cleared the bits because of spurious bits being set at boot,
does this observation pertain only to the [Over|Under][Current|Voltage] bits
or does other bits also tends to be spurious? just curious.
Thanks again for your time.
- Richard
> FYI, the reason the register is cleared is to avoid displaying power-on
> faults to the user. Would it be more useful to clear all bits except the
> EN bit?
>
> Ira
>
>> drivers/hwmon/ltc4215.c | 4 ++++
>> 1 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/hwmon/ltc4215.c b/drivers/hwmon/ltc4215.c
>> index 40ebdfc..fa18e40 100644
>> --- a/drivers/hwmon/ltc4215.c
>> +++ b/drivers/hwmon/ltc4215.c
>> @@ -278,6 +278,10 @@ static int ltc4215_probe(struct i2c_client *client,
>> i2c_set_clientdata(client, data);
>> mutex_init(&data->update_lock);
>>
>> + /* Print fault register status at boot, before clearing it */
>> + dev_info(&client->dev, "FReg: 0x%0x\n",
>> + i2c_smbus_read_byte_data(client, LTC4215_FAULT));
>> +
>> /* Initialize the LTC4215 chip */
>> i2c_smbus_write_byte_data(client, LTC4215_FAULT, 0x00);
>>
>> --
>> 1.7.2.3
>>
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
@ 2010-11-18 17:32 ` Guenter Roeck
2010-11-18 17:45 ` Richard Retanubun
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Guenter Roeck @ 2010-11-18 17:32 UTC (permalink / raw)
To: lm-sensors
On Thu, Nov 18, 2010 at 11:50:35AM -0500, Richard Retanubun wrote:
> On 11/18/10 11:22, Ira W. Snyder wrote:
> > On Thu, Nov 18, 2010 at 10:16:27AM -0500, Richard Retanubun wrote:
> >> Some bits in the fault register can be useful to differentiate
> >> between a planned reset (reboot) or an unplanned reset (pwr-loss).
> >> The EN bit can be used for this detection when a board's planned
> >> reboot action toggles the EN bit and cuts the regulated voltage
> >> (but keeping the hotswap device alive), meanwhile an unplanned
> >> reset (pwr-loss) will not have the EN bit set because even the
> >> hotswap device got powered off.
> >>
> >> So, before clearing the fault register at boot, dump the contents
> >> of the fault register so that other tools can use it as a forensic
> >> marker to differentiate events that preceeds this boot.
> >> ---
> >> Hi Ira and Jean,
> >>
> >> Not sure if there is a mailing list for hwmon, I did not see it,
> >> the MAINTAINERS file did not help, but git history shows that you two
> >> are the people who are working on this driver in the patch.
> >>
> Hi Ira,
>
> Thanks for the speedy reply.
>
> >
> > The mailing list you wanted was lm-sensors@lm-sensors.org.
> Thanks, was not sure, cc-ed now.
> >
> > As for the patch, I think it would be more likely to be accepted with a few changes:
> > - only print the register if certain bits are set
> > - don't print the hex value, print something meaningful
> Sure, but which bits? almost all of them means something to someone depending on how the chip is used.
> This is why I opted to a more "interpretation-neutral" hex-value.
>
> I also chose "FReg" because some tools may catch the word "Fault" and flag this dump as something scary :)
>
> I can do basic bit to string mapping using the register names,
> something like (GPIO3, GPIO2, FET, EN, PwrBad, OverCurrent, UnderVoltage, OverVoltage)
> in the case where all the bits are set, is this what you have in mind?
>
> Also, another feedback I got was to not only print this at device_probe, but also
> save the initial value and display it as a device attribute exposed via sysfs,
> do you have any comments on this approach?
>
> Btw, in your commit, you cleared the bits because of spurious bits being set at boot,
> does this observation pertain only to the [Over|Under][Current|Voltage] bits
> or does other bits also tends to be spurious? just curious.
>
I can not comment on ltc4215 behavior, but ltc4261 tends to have alarm bits
set after boot. To display that would simply add noise and not provide any value.
If the ltc4215 does the same, adding the log message would only (and unnecessarily)
cause users to be concerned. This in turn would subject the mailing list, and thus
the maintainers, to e-mails such as "I get this message during boot, is this
a problem ?".
I don't like the proposed change, unless someone can convince me that it adds value
and not just noise (and maybe volunteers to forever reply to the resulting e-mails).
> Thanks again for your time.
>
> - Richard
>
> > FYI, the reason the register is cleared is to avoid displaying power-on
> > faults to the user. Would it be more useful to clear all bits except the
> > EN bit?
> >
So the EN bit changed state, which might just mean that the chip was enabled at some point
in the past. What would be the value of displaying that information ?
If you want to do anything, you might consider displaying the _current_ EN state
if the chip is disabled, or maybe not even instantiate the driver in the first place
if that is the case.
Thanks,
Guenter
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
2010-11-18 17:32 ` Guenter Roeck
@ 2010-11-18 17:45 ` Richard Retanubun
2010-11-18 18:23 ` Guenter Roeck
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Richard Retanubun @ 2010-11-18 17:45 UTC (permalink / raw)
To: lm-sensors
> I can not comment on ltc4215 behavior, but ltc4261 tends to have alarm bits
> set after boot. To display that would simply add noise and not provide any value.
> If the ltc4215 does the same, adding the log message would only (and unnecessarily)
> cause users to be concerned. This in turn would subject the mailing list, and thus
> the maintainers, to e-mails such as "I get this message during boot, is this
> a problem ?".
>
> I don't like the proposed change, unless someone can convince me that it adds value
> and not just noise (and maybe volunteers to forever reply to the resulting e-mails).
>
Agreed, the print at boot may not be a good idea, I am working on adding this as
an attribute that can be checked via sysfs and stop printing this at boot.
> So the EN bit changed state, which might just mean that the chip was enabled at some point
> in the past. What would be the value of displaying that information ?
Consider this scenario:
If the SW reboots the system, and it does so by toggling EN-bit when this happens the ltc4125 cuts out the voltage it is regulating and (most) of the components reboots,
but because it is still alive (its VDD is still good) it will set the EN-toggled-bit, which will be detected at next-boot.
If the system actually loses power, the ltc4215 itself is powered down, and at next-boot the EN-toggle-bit will not be set.
Using this, one can differentiate a "SW-reboot" Vs "A real power-down because external power is cut lost"
> If you want to do anything, you might consider displaying the _current_ EN state
I am considering displaying both the fault-reg value at boot and its current one in the sysfs access method.
> if the chip is disabled, or maybe not even instantiate the driver in the first place
> if that is the case.
>
> Thanks,
> Guenter
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
2010-11-18 17:32 ` Guenter Roeck
2010-11-18 17:45 ` Richard Retanubun
@ 2010-11-18 18:23 ` Guenter Roeck
2010-11-18 18:45 ` Ira W. Snyder
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Guenter Roeck @ 2010-11-18 18:23 UTC (permalink / raw)
To: lm-sensors
On Thu, Nov 18, 2010 at 12:45:34PM -0500, Richard Retanubun wrote:
> > I can not comment on ltc4215 behavior, but ltc4261 tends to have alarm bits
> > set after boot. To display that would simply add noise and not provide any value.
> > If the ltc4215 does the same, adding the log message would only (and unnecessarily)
> > cause users to be concerned. This in turn would subject the mailing list, and thus
> > the maintainers, to e-mails such as "I get this message during boot, is this
> > a problem ?".
> >
> > I don't like the proposed change, unless someone can convince me that it adds value
> > and not just noise (and maybe volunteers to forever reply to the resulting e-mails).
> >
> Agreed, the print at boot may not be a good idea, I am working on adding this as
> an attribute that can be checked via sysfs and stop printing this at boot.
>
That would violate the sysfs abi.
> > So the EN bit changed state, which might just mean that the chip was enabled at some point
> > in the past. What would be the value of displaying that information ?
>
> Consider this scenario:
> If the SW reboots the system, and it does so by toggling EN-bit when this happens the ltc4125 cuts out the voltage it is regulating and (most) of the components reboots,
> but because it is still alive (its VDD is still good) it will set the EN-toggled-bit, which will be detected at next-boot.
>
> If the system actually loses power, the ltc4215 itself is powered down, and at next-boot the EN-toggle-bit will not be set.
>
> Using this, one can differentiate a "SW-reboot" Vs "A real power-down because external power is cut lost"
>
Lots of context knowledge. That applies to one specific system doing this, but not to all the others.
Some other system may have the EN toggled bit set after a power on reboot because of the way it is enabled
after power on.
A more common and chip independent solution to such scenarios would be to have a dedicated register
in the system to log reboot reasons. This can be anything - nvram, a chip register, or something else.
I don't think it is a good idea to make assumptions about reboot reasons based on the behavior
of one specific chip.
Guenter
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
` (2 preceding siblings ...)
2010-11-18 18:23 ` Guenter Roeck
@ 2010-11-18 18:45 ` Ira W. Snyder
2010-11-18 19:05 ` Ira W. Snyder
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Ira W. Snyder @ 2010-11-18 18:45 UTC (permalink / raw)
To: lm-sensors
On Thu, Nov 18, 2010 at 10:23:08AM -0800, Guenter Roeck wrote:
> On Thu, Nov 18, 2010 at 12:45:34PM -0500, Richard Retanubun wrote:
> > > I can not comment on ltc4215 behavior, but ltc4261 tends to have alarm bits
> > > set after boot. To display that would simply add noise and not provide any value.
> > > If the ltc4215 does the same, adding the log message would only (and unnecessarily)
> > > cause users to be concerned. This in turn would subject the mailing list, and thus
> > > the maintainers, to e-mails such as "I get this message during boot, is this
> > > a problem ?".
> > >
> > > I don't like the proposed change, unless someone can convince me that it adds value
> > > and not just noise (and maybe volunteers to forever reply to the resulting e-mails).
> > >
> > Agreed, the print at boot may not be a good idea, I am working on adding this as
> > an attribute that can be checked via sysfs and stop printing this at boot.
> >
> That would violate the sysfs abi.
>
> > > So the EN bit changed state, which might just mean that the chip was enabled at some point
> > > in the past. What would be the value of displaying that information ?
> >
> > Consider this scenario:
> > If the SW reboots the system, and it does so by toggling EN-bit when this happens the ltc4125 cuts out the voltage it is regulating and (most) of the components reboots,
> > but because it is still alive (its VDD is still good) it will set the EN-toggled-bit, which will be detected at next-boot.
> >
> > If the system actually loses power, the ltc4215 itself is powered down, and at next-boot the EN-toggle-bit will not be set.
> >
> > Using this, one can differentiate a "SW-reboot" Vs "A real power-down because external power is cut lost"
> >
> Lots of context knowledge. That applies to one specific system doing this, but not to all the others.
> Some other system may have the EN toggled bit set after a power on reboot because of the way it is enabled
> after power on.
>
> A more common and chip independent solution to such scenarios would be to have a dedicated register
> in the system to log reboot reasons. This can be anything - nvram, a chip register, or something else.
> I don't think it is a good idea to make assumptions about reboot reasons based on the behavior
> of one specific chip.
>
I can offer some insight on how I'd handle this, if the LTC4215 EN bit
was the method used to detect SW reset on my board.
I use the U-Boot bootloader to load Linux. I have no need to access the
chip while the bootloader is running. It would be very easy to have the
bootloader read the value and cache it somewhere. Some possibilities
include the U-Boot environment, or an unused register (I use a mpc83xx
chip, which has two of these).
Ira
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
` (3 preceding siblings ...)
2010-11-18 18:45 ` Ira W. Snyder
@ 2010-11-18 19:05 ` Ira W. Snyder
2010-11-18 19:09 ` Guenter Roeck
2010-11-18 19:17 ` Guenter Roeck
6 siblings, 0 replies; 8+ messages in thread
From: Ira W. Snyder @ 2010-11-18 19:05 UTC (permalink / raw)
To: lm-sensors
On Thu, Nov 18, 2010 at 11:50:35AM -0500, Richard Retanubun wrote:
> On 11/18/10 11:22, Ira W. Snyder wrote:
> > On Thu, Nov 18, 2010 at 10:16:27AM -0500, Richard Retanubun wrote:
> >> Some bits in the fault register can be useful to differentiate
> >> between a planned reset (reboot) or an unplanned reset (pwr-loss).
> >> The EN bit can be used for this detection when a board's planned
> >> reboot action toggles the EN bit and cuts the regulated voltage
> >> (but keeping the hotswap device alive), meanwhile an unplanned
> >> reset (pwr-loss) will not have the EN bit set because even the
> >> hotswap device got powered off.
> >>
> >> So, before clearing the fault register at boot, dump the contents
> >> of the fault register so that other tools can use it as a forensic
> >> marker to differentiate events that preceeds this boot.
> >> ---
> >> Hi Ira and Jean,
> >>
> >> Not sure if there is a mailing list for hwmon, I did not see it,
> >> the MAINTAINERS file did not help, but git history shows that you two
> >> are the people who are working on this driver in the patch.
> >>
> Hi Ira,
>
> Thanks for the speedy reply.
>
> >
> > The mailing list you wanted was lm-sensors@lm-sensors.org.
> Thanks, was not sure, cc-ed now.
> >
> > As for the patch, I think it would be more likely to be accepted with a few changes:
> > - only print the register if certain bits are set
> > - don't print the hex value, print something meaningful
> Sure, but which bits? almost all of them means something to someone depending on how the chip is used.
> This is why I opted to a more "interpretation-neutral" hex-value.
>
> I also chose "FReg" because some tools may catch the word "Fault" and flag this dump as something scary :)
>
> I can do basic bit to string mapping using the register names,
> something like (GPIO3, GPIO2, FET, EN, PwrBad, OverCurrent, UnderVoltage, OverVoltage)
> in the case where all the bits are set, is this what you have in mind?
>
> Also, another feedback I got was to not only print this at device_probe, but also
> save the initial value and display it as a device attribute exposed via sysfs,
> do you have any comments on this approach?
>
> Btw, in your commit, you cleared the bits because of spurious bits being set at boot,
> does this observation pertain only to the [Over|Under][Current|Voltage] bits
> or does other bits also tends to be spurious? just curious.
>
I just booted a board and checked: I only saw the "Undervoltage fault
occurred" bit set (the register value was 0x02).
The LTC4215_FAULT register isn't used by the Linux driver at all. How
about doing this instead:
Change the ltc4215_probe() routine to do a read-modify-write of the
LTC4215_FAULT register, like this:
s32 val;
val = i2c_smbus_read_byte_data(client, LTC4215_FAULT);
val &= 0xf0; /* clear spurious power-on faults */
i2c_smbus_write_byte_data(client, LTC4215_FAULT, val);
Since the ltc4215 driver doesn't use the LTC4215_FAULT register at all,
you can use userspace to read the value. Like this:
$ i2cget -f -y 0x0 0x44 3 b
0x00
Of course, you can do this with C code rather than using i2cget: look at
the i2cget source for an example. :)
Ira
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
` (4 preceding siblings ...)
2010-11-18 19:05 ` Ira W. Snyder
@ 2010-11-18 19:09 ` Guenter Roeck
2010-11-18 19:17 ` Guenter Roeck
6 siblings, 0 replies; 8+ messages in thread
From: Guenter Roeck @ 2010-11-18 19:09 UTC (permalink / raw)
To: lm-sensors
On Thu, Nov 18, 2010 at 01:45:29PM -0500, Ira W. Snyder wrote:
[ ... ]
> >
> > A more common and chip independent solution to such scenarios would be to have a dedicated register
> > in the system to log reboot reasons. This can be anything - nvram, a chip register, or something else.
> > I don't think it is a good idea to make assumptions about reboot reasons based on the behavior
> > of one specific chip.
> >
>
> I can offer some insight on how I'd handle this, if the LTC4215 EN bit
> was the method used to detect SW reset on my board.
>
> I use the U-Boot bootloader to load Linux. I have no need to access the
> chip while the bootloader is running. It would be very easy to have the
> bootloader read the value and cache it somewhere. Some possibilities
> include the U-Boot environment, or an unused register (I use a mpc83xx
> chip, which has two of these).
>
Even then it might be simpler to just store the reset reason in one of the unused
registers and access it after the reboot. This assumes, of course, that the registers
retain their value after a reboot and have a well defined content after poweron.
Guenter
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lm-sensors] [PATCH 3/3] Print the status of fault register at
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
` (5 preceding siblings ...)
2010-11-18 19:09 ` Guenter Roeck
@ 2010-11-18 19:17 ` Guenter Roeck
6 siblings, 0 replies; 8+ messages in thread
From: Guenter Roeck @ 2010-11-18 19:17 UTC (permalink / raw)
To: lm-sensors
On Thu, Nov 18, 2010 at 02:05:07PM -0500, Ira W. Snyder wrote:
[ .. ]
>
> I just booted a board and checked: I only saw the "Undervoltage fault
> occurred" bit set (the register value was 0x02).
>
Sounds similar to what I see on ltc4261.
> The LTC4215_FAULT register isn't used by the Linux driver at all. How
> about doing this instead:
>
> Change the ltc4215_probe() routine to do a read-modify-write of the
> LTC4215_FAULT register, like this:
>
> s32 val;
> val = i2c_smbus_read_byte_data(client, LTC4215_FAULT);
> val &= 0xf0; /* clear spurious power-on faults */
> i2c_smbus_write_byte_data(client, LTC4215_FAULT, val);
>
You still don't know what other boards do with the EN bit, though.
> Since the ltc4215 driver doesn't use the LTC4215_FAULT register at all,
> you can use userspace to read the value. Like this:
> $ i2cget -f -y 0x0 0x44 3 b
> 0x00
>
I don't think that will work if the ltc4215 driver is loaded, since loading the driver
blocks the i2c address. You can do that, though, prior to loading the ltc4215 driver.
But then you don't need to touch the driver either, because that code would execute
before the driver is loaded.
Something like
modprobe i2c-dev
status=$(i2cget -f -y 0x0 0x44 3 b)
echo ltc4215 0x44 >/sys/class/i2c-adapter/i2c-0/new_device
should do it.
Guenter
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-11-18 19:17 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-18 16:50 [lm-sensors] [PATCH 3/3] Print the status of fault register at Richard Retanubun
2010-11-18 17:32 ` Guenter Roeck
2010-11-18 17:45 ` Richard Retanubun
2010-11-18 18:23 ` Guenter Roeck
2010-11-18 18:45 ` Ira W. Snyder
2010-11-18 19:05 ` Ira W. Snyder
2010-11-18 19:09 ` Guenter Roeck
2010-11-18 19:17 ` Guenter Roeck
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.