netdev.vger.kernel.org archive mirror
From: Guenter Roeck <linux@roeck-us.net>
To: Vadim Pasternak <vadimp@mellanox.com>
Cc: Andrew Lunn <andrew@lunn.ch>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"rui.zhang@intel.com" <rui.zhang@intel.com>,
	"edubezval@gmail.com" <edubezval@gmail.com>,
	"jiri@resnulli.us" <jiri@resnulli.us>, mlxsw <mlxsw@mellanox.com>,
	Michael Shych <michaelsh@mellanox.com>
Subject: Re: [patch net-next RFC 11/12] mlxsw: core: Extend hwmon interface with FAN fault attribute
Date: Tue, 26 Jun 2018 11:32:44 -0700
Message-ID: <20180626183244.GB4307@roeck-us.net>
In-Reply-To: <HE1PR0502MB3753FB368101C4B5B5849BE6A2490@HE1PR0502MB3753.eurprd05.prod.outlook.com>

On Tue, Jun 26, 2018 at 04:47:05PM +0000, Vadim Pasternak wrote:
> 
> 
> > -----Original Message-----
> > From: Guenter Roeck [mailto:linux@roeck-us.net]
> > Sent: Tuesday, June 26, 2018 7:33 PM
> > To: Vadim Pasternak <vadimp@mellanox.com>
> > Cc: Andrew Lunn <andrew@lunn.ch>; davem@davemloft.net;
> > netdev@vger.kernel.org; rui.zhang@intel.com; edubezval@gmail.com;
> > jiri@resnulli.us; mlxsw <mlxsw@mellanox.com>; Michael Shych
> > <michaelsh@mellanox.com>
> > Subject: Re: [patch net-next RFC 11/12] mlxsw: core: Extend hwmon interface
> > with FAN fault attribute
> > 
> > On Tue, Jun 26, 2018 at 02:47:01PM +0000, Vadim Pasternak wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Andrew Lunn [mailto:andrew@lunn.ch]
> > > > Sent: Tuesday, June 26, 2018 5:29 PM
> > > > To: Vadim Pasternak <vadimp@mellanox.com>
> > > > Cc: davem@davemloft.net; netdev@vger.kernel.org; linux@roeck-us.net;
> > > > rui.zhang@intel.com; edubezval@gmail.com; jiri@resnulli.us; mlxsw
> > > > <mlxsw@mellanox.com>; Michael Shych <michaelsh@mellanox.com>
> > > > Subject: Re: [patch net-next RFC 11/12] mlxsw: core: Extend hwmon
> > > > interface with FAN fault attribute
> > > >
> > > > > +static ssize_t mlxsw_hwmon_fan_fault_show(struct device *dev,
> > > > > +					  struct device_attribute *attr,
> > > > > +					  char *buf)
> > > > > +{
> > > > > +	struct mlxsw_hwmon_attr *mlwsw_hwmon_attr =
> > > > > +			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
> > > > > +	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
> > > > > +	char mfsm_pl[MLXSW_REG_MFSM_LEN];
> > > > > +	u16 tach;
> > > > > +	int err;
> > > > > +
> > > > > +	mlxsw_reg_mfsm_pack(mfsm_pl, mlwsw_hwmon_attr->type_index);
> > > > > +	err = mlxsw_reg_query(mlxsw_hwmon->core, MLXSW_REG(mfsm), mfsm_pl);
> > > > > +	if (err) {
> > > > > +		dev_err(mlxsw_hwmon->bus_info->dev, "Failed to query fan\n");
> > > > > +		return err;
> > > > > +	}
> > > > > +	tach = mlxsw_reg_mfsm_rpm_get(mfsm_pl);
> > > > > +
> > > > > +	return sprintf(buf, "%u\n", (tach < mlxsw_hwmon->tach_min) ? 1 : 0);
> > > > > +}
> > > >
> > > > Documentation/hwmon/sysfs-interface says:
> > > >
> > > > Alarms are direct indications read from the chips. The drivers do
> > > > NOT make comparisons of readings to thresholds. This allows
> > > > violations between readings to be caught and alarmed. The exact
> > > > definition of an alarm (for example, whether a threshold must be met
> > > > or must be exceeded to cause an alarm) is chip-dependent.
> > > >
> > > > Now, this is a fault, not an alarm. But does the same apply?
> > >
> > Yes, it does. There are no "soft" alarms / faults.
> > 
> > > Hi Andrew,
> > >
> > > Hardware provides a minimum value for the tachometer.
> > > The tachometer is considered faulty if it reads below this value.
> > 
> > This is for user space to decide, not for the kernel.
> 
> Hi Guenter,
> 
> Do you suggest exposing fan{x}_min instead of fan{x}_fault,
> and letting the user compare fan{x}_input against fan{x}_min for the
> fault decision?
> 

fanX_min only makes sense if programmed into or reported by the chip
or controller (that is what the attribute is for), usually to enable
the chip/controller to set an alarm. If the chip or controller does
not have a minimum speed register, the attribute should not exist,
and any decision based on a comparison between a minimum fan speed
and the actual fan speed is a user space problem.
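
A minimal sketch of such a user space comparison, assuming the policy
minimum is chosen by the integrator rather than read from the chip.
The helper names and any sysfs path passed in (for example a
fanX_input file under /sys/class/hwmon) are hypothetical:

```c
#include <stdio.h>

/*
 * Read one integer from a sysfs-style text attribute.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_long_attr(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Hypothetical user space policy: the fan counts as too slow when
 * the reported speed is below a minimum chosen outside the kernel.
 */
static int fan_too_slow(long rpm, long policy_min_rpm)
{
	return rpm < policy_min_rpm;
}
```

The threshold lives entirely in the policy code; the driver only has
to supply the raw reading.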

I don't know what the tach_min calculation is about, but setting
it to the minimum of all tachometer speeds (or of all reported
minimums?) is not the task of a hwmon driver. A hwmon driver
reports what it gets from hardware; the interpretation is up
to other parts of the system (e.g. userspace or the thermal
subsystem). That includes any software-based decision on whether
an alarm or fault should be reported.
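
To illustrate the "report what hardware gives" rule, here is a user
space model of a show callback that only formats the raw tachometer
value. This is a sketch, not kernel code: the function name is
hypothetical, and a real callback would obtain the reading through
the register query shown in the quoted patch.

```c
#include <stdio.h>

/*
 * User space model of a hwmon "show" callback (hypothetical name).
 * It formats the raw tachometer reading and nothing else: no
 * threshold comparison, no derived fault bit.
 */
static int fan_input_show_model(char *buf, size_t len, unsigned int tach_rpm)
{
	return snprintf(buf, len, "%u\n", tach_rpm);
}
```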

> > 
> > > In case any tachometer is faulty, PWM according to the system
> > > requirements should be set to 100% until the fault
> > 
> > system requirements. Again, this is for user space to decide.
> 
> 
> Yes, the user should decide in this case, and I wanted to provide
> fan{x}_fault to the user for this purpose. But the user could decide
> based on the input and min attributes, of course.
> 
Note that "fault" and "alarm" have distinct meanings.
Many fan controllers can detect if a fan is faulty (e.g. no sensor
connected or it is deemed faulty) or if it just runs too slowly.
The typical remedy is also different: a slow fan may just need
more pwm or voltage, a faulty fan needs to be replaced.
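
A sketch of how a user space policy might keep the two apart, assuming
it has already read the chip-reported fault and alarm flags (for
example from fanX_fault and fanX_alarm). The enum and function names
are made up for illustration:

```c
/* Possible reactions of a hypothetical fan policy daemon. */
enum fan_action {
	FAN_ACTION_NONE,
	FAN_ACTION_RAISE_PWM,	/* fan is slow: give it more pwm or voltage */
	FAN_ACTION_REPLACE,	/* fan is faulty: flag it for replacement */
};

/*
 * Map the two flags to a remedy. A fault takes precedence, since
 * more pwm will not help a fan that is missing or broken.
 */
static enum fan_action fan_pick_action(int fault, int alarm)
{
	if (fault)
		return FAN_ACTION_REPLACE;
	if (alarm)
		return FAN_ACTION_RAISE_PWM;
	return FAN_ACTION_NONE;
}
```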

Guenter

> > 
> > > is not recovered (f.e. by physical replacing of bad unit).
> > > This is the motivation to expose fan{x}_fault in the way it's exposed.
> > >
> > > Thanks,
> > > Vadim.
> > >
> > > >
> > > >      Andrew
