Linux Documentation
 help / color / mirror / Atom feed
* Re: [PATCH v2 7/9] trace_uprobe/sdt: Fix multiple update of same reference counter
From: Ravi Bangoria @ 2018-04-11  4:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: mhiramat, peterz, srikar, rostedt, acme, ananth, akpm,
	alexander.shishkin, alexis.berlemont, corbet, dan.j.williams,
	jolsa, kan.liang, kjlx, kstewart, linux-doc, linux-kernel,
	linux-mm, milian.wolff, mingo, namhyung, naveen.n.rao, pc, tglx,
	yao.jin, fengguang.wu, jglisse, Ravi Bangoria
In-Reply-To: <20180410110633.GA29063@redhat.com>

Hi Oleg,

On 04/10/2018 04:36 PM, Oleg Nesterov wrote:
> Hi Ravi,
>
> On 04/10, Ravi Bangoria wrote:
>>> and what if __mmu_notifier_register() fails simply because signal_pending() == T?
>>> see mm_take_all_locks().
>>>
>>> at first glance this all look suspicious and sub-optimal,
>> Yes. I should have added checks for failure cases.
>> Will fix them in v3.
> And what can you do if it fails? Nothing except report the problem. But
> signal_pending() is not the unlikely or error condition, it should not
> cause the tracing errors.

...

> Plus mm_take_all_locks() is very heavy... BTW, uprobe_mmap_callback() is
> called unconditionally. Whatever it does, can we at least move it after
> the no_uprobe_events() check? Can't we also check MMF_HAS_UPROBES?

Sure, I'll move it after these conditions.

> Either way, I do not feel that mmu_notifier is the right tool... Did you
> consider the uprobe_clear_state() hook we already have?

Ah! This is really a good idea. We don't need mmu_notifier then.

Thanks for suggestion,
Ravi

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
From: Guenter Roeck @ 2018-04-10 22:28 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc
In-Reply-To: <20180410183212.16787-10-jae.hyun.yoo@linux.intel.com>

On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:
> This commit adds PECI cputemp and dimmtemp hwmon drivers.
> 
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
> Reviewed-by: James Feist <james.feist@linux.intel.com>
> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Andrew Jeffery <andrew@aj.id.au>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Julia Cartwright <juliac@eso.teric.us>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Cc: Milton Miller II <miltonm@us.ibm.com>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
> ---
>  drivers/hwmon/Kconfig         |  28 ++
>  drivers/hwmon/Makefile        |   2 +
>  drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
>  4 files changed, 1245 insertions(+)
>  create mode 100644 drivers/hwmon/peci-cputemp.c
>  create mode 100644 drivers/hwmon/peci-dimmtemp.c
> 
> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> index f249a4428458..c52f610f81d0 100644
> --- a/drivers/hwmon/Kconfig
> +++ b/drivers/hwmon/Kconfig
> @@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
>  	  This driver can also be built as a module.  If so, the module
>  	  will be called nct7904.
>  
> +config SENSORS_PECI_CPUTEMP
> +	tristate "PECI CPU temperature monitoring support"
> +	depends on OF
> +	depends on PECI
> +	help
> +	  If you say yes here you get support for the generic Intel PECI
> +	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
> +	  readings of the CPU package and CPU cores that are accessible using
> +	  the PECI Client Command Suite via the processor PECI client.
> +	  Check Documentation/hwmon/peci-cputemp for details.
> +
> +	  This driver can also be built as a module.  If so, the module
> +	  will be called peci-cputemp.
> +
> +config SENSORS_PECI_DIMMTEMP
> +	tristate "PECI DIMM temperature monitoring support"
> +	depends on OF
> +	depends on PECI
> +	help
> +	  If you say yes here you get support for the generic Intel PECI hwmon
> +	  driver which provides Digital Thermal Sensor (DTS) thermal readings of
> +	  DIMM components that are accessible using the PECI Client Command
> +	  Suite via the processor PECI client.
> +	  Check Documentation/hwmon/peci-dimmtemp for details.
> +
> +	  This driver can also be built as a module.  If so, the module
> +	  will be called peci-dimmtemp.
> +
>  config SENSORS_NSA320
>  	tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
>  	depends on GPIOLIB && OF
> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> index e7d52a36e6c4..48d9598fcd3a 100644
> --- a/drivers/hwmon/Makefile
> +++ b/drivers/hwmon/Makefile
> @@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)	+= nct7802.o
>  obj-$(CONFIG_SENSORS_NCT7904)	+= nct7904.o
>  obj-$(CONFIG_SENSORS_NSA320)	+= nsa320-hwmon.o
>  obj-$(CONFIG_SENSORS_NTC_THERMISTOR)	+= ntc_thermistor.o
> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
>  obj-$(CONFIG_SENSORS_PC87360)	+= pc87360.o
>  obj-$(CONFIG_SENSORS_PC87427)	+= pc87427.o
>  obj-$(CONFIG_SENSORS_PCF8591)	+= pcf8591.o
> diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
> new file mode 100644
> index 000000000000..f0bc92687512
> --- /dev/null
> +++ b/drivers/hwmon/peci-cputemp.c
> @@ -0,0 +1,783 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2018 Intel Corporation
> +
> +#include <linux/delay.h>
> +#include <linux/hwmon.h>
> +#include <linux/hwmon-sysfs.h>

Is this include needed ?

> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/of_device.h>
> +#include <linux/peci.h>
> +
> +#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
> +
> +#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
> +#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
> +#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
> +
> +#define DEFAULT_CHANNEL_NUMS  5
> +#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
> +#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
> +
> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
> +
> +#define UPDATE_INTERVAL_MIN   HZ
> +
> +enum cpu_gens {
> +	CPU_GEN_HSX, /* Haswell Xeon */
> +	CPU_GEN_BRX, /* Broadwell Xeon */
> +	CPU_GEN_SKX, /* Skylake Xeon */
> +	CPU_GEN_MAX
> +};
> +
> +struct cpu_gen_info {
> +	u32 type;
> +	u32 cpu_id;
> +	u32 core_max;
> +};
> +
> +struct temp_data {
> +	bool valid;
> +	s32  value;
> +	unsigned long last_updated;
> +};
> +
> +struct temp_group {
> +	struct temp_data die;
> +	struct temp_data dts_margin;
> +	struct temp_data tcontrol;
> +	struct temp_data tthrottle;
> +	struct temp_data tjmax;
> +	struct temp_data core[CORETEMP_CHANNEL_NUMS];
> +};
> +
> +struct peci_cputemp {
> +	struct peci_client *client;
> +	struct device *dev;
> +	char name[PECI_NAME_SIZE];
> +	struct temp_group temp;
> +	u8 addr;
> +	uint cpu_no;
> +	const struct cpu_gen_info *gen_info;
> +	u32 core_mask;
> +	u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
> +	uint config_idx;
> +	struct hwmon_channel_info temp_info;
> +	const struct hwmon_channel_info *info[2];
> +	struct hwmon_chip_info chip;
> +};
> +
> +enum cputemp_channels {
> +	channel_die,
> +	channel_dts_mrgn,
> +	channel_tcontrol,
> +	channel_tthrottle,
> +	channel_tjmax,
> +	channel_core,
> +};
> +
> +static const struct cpu_gen_info cpu_gen_info_table[] = {
> +	{ .type = CPU_GEN_HSX,
> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
> +	  .core_max = CORE_MAX_ON_HSX },
> +	{ .type = CPU_GEN_BRX,
> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
> +	  .core_max = CORE_MAX_ON_BDX },
> +	{ .type = CPU_GEN_SKX,
> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
> +	  .core_max = CORE_MAX_ON_SKX },
> +};
> +
> +static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
> +	/* Die temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
> +	HWMON_T_CRIT_HYST,
> +
> +	/* DTS margin temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
> +
> +	/* Tcontrol temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
> +
> +	/* Tthrottle temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +
> +	/* Tjmax temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +
> +	/* Core temperature - for all core channels */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
> +	HWMON_T_CRIT_HYST,
> +};
> +
> +static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
> +	"Die",
> +	"DTS margin",
> +	"Tcontrol",
> +	"Tthrottle",
> +	"Tjmax",
> +	"Core 0", "Core 1", "Core 2", "Core 3",
> +	"Core 4", "Core 5", "Core 6", "Core 7",
> +	"Core 8", "Core 9", "Core 10", "Core 11",
> +	"Core 12", "Core 13", "Core 14", "Core 15",
> +	"Core 16", "Core 17", "Core 18", "Core 19",
> +	"Core 20", "Core 21", "Core 22", "Core 23",
> +};
> +
> +static int send_peci_cmd(struct peci_cputemp *priv,
> +			 enum peci_cmd cmd,
> +			 void *msg)
> +{
> +	return peci_command(priv->client->adapter, cmd, msg);
> +}
> +
> +static int need_update(struct temp_data *temp)

Please use bool.

> +{
> +	if (temp->valid &&
> +	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +static void mark_updated(struct temp_data *temp)
> +{
> +	temp->valid = true;
> +	temp->last_updated = jiffies;
> +}
> +
> +static s32 ten_dot_six_to_millidegree(s32 val)
> +{
> +	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
> +}
> +
> +static int get_tjmax(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	int rc;
> +
> +	if (!priv->temp.tjmax.valid) {
> +		msg.addr = priv->addr;
> +		msg.index = MBX_INDEX_TEMP_TARGET;
> +		msg.param = 0;
> +		msg.rx_len = 4;
> +
> +		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +		if (rc)
> +			return rc;
> +
> +		priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
> +		priv->temp.tjmax.valid = true;
> +	}
> +
> +	return 0;
> +}
> +
> +static int get_tcontrol(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 tcontrol_margin;
> +	s32 tthrottle_offset;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.tcontrol))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_TEMP_TARGET;
> +	msg.param = 0;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	tcontrol_margin = msg.pkg_config[1];
> +	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
> +	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
> +
> +	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
> +	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
> +
> +	mark_updated(&priv->temp.tcontrol);
> +	mark_updated(&priv->temp.tthrottle);
> +
> +	return 0;
> +}
> +
> +static int get_tthrottle(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 tcontrol_margin;
> +	s32 tthrottle_offset;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.tthrottle))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_TEMP_TARGET;
> +	msg.param = 0;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
> +	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
> +
> +	tcontrol_margin = msg.pkg_config[1];
> +	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
> +	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
> +
> +	mark_updated(&priv->temp.tthrottle);
> +	mark_updated(&priv->temp.tcontrol);
> +
> +	return 0;
> +}

I am quite completely missing how the two functions above are different.

> +
> +static int get_die_temp(struct peci_cputemp *priv)
> +{
> +	struct peci_get_temp_msg msg;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.die))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
> +	if (rc)
> +		return rc;
> +
> +	priv->temp.die.value = priv->temp.tjmax.value +
> +			       ((s32)msg.temp_raw * 1000 / 64);
> +
> +	mark_updated(&priv->temp.die);
> +
> +	return 0;
> +}
> +
> +static int get_dts_margin(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 dts_margin;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.dts_margin))
> +		return 0;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_DTS_MARGIN;
> +	msg.param = 0;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
> +
> +	/**
> +	 * Processors return a value of DTS reading in 10.6 format
> +	 * (10 bits signed decimal, 6 bits fractional).
> +	 * Error codes:
> +	 *   0x8000: General sensor error
> +	 *   0x8001: Reserved
> +	 *   0x8002: Underflow on reading value
> +	 *   0x8003-0x81ff: Reserved
> +	 */
> +	if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
> +		return -EIO;
> +
> +	dts_margin = ten_dot_six_to_millidegree(dts_margin);
> +
> +	priv->temp.dts_margin.value = dts_margin;
> +
> +	mark_updated(&priv->temp.dts_margin);
> +
> +	return 0;
> +}
> +
> +static int get_core_temp(struct peci_cputemp *priv, int core_index)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 core_dts_margin;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.core[core_index]))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
> +	msg.param = core_index;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
> +
> +	/**
> +	 * Processors return a value of the core DTS reading in 10.6 format
> +	 * (10 bits signed decimal, 6 bits fractional).
> +	 * Error codes:
> +	 *   0x8000: General sensor error
> +	 *   0x8001: Reserved
> +	 *   0x8002: Underflow on reading value
> +	 *   0x8003-0x81ff: Reserved
> +	 */
> +	if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
> +		return -EIO;
> +
> +	core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
> +
> +	priv->temp.core[core_index].value = priv->temp.tjmax.value +
> +					    core_dts_margin;
> +
> +	mark_updated(&priv->temp.core[core_index]);
> +
> +	return 0;
> +}
> +

There is a lot of duplication in those functions. Would it be possible
to find common code and use functions for it instead of duplicating
everything several times ?

> +static int find_core_index(struct peci_cputemp *priv, int channel)
> +{
> +	int core_channel = channel - DEFAULT_CHANNEL_NUMS;
> +	int idx, found = 0;
> +
> +	for (idx = 0; idx < priv->gen_info->core_max; idx++) {
> +		if (priv->core_mask & BIT(idx)) {
> +			if (core_channel == found)
> +				break;
> +
> +			found++;
> +		}
> +	}
> +
> +	return idx;

What if nothing is found ?

> +}
> +
> +static int cputemp_read_string(struct device *dev,
> +			       enum hwmon_sensor_types type,
> +			       u32 attr, int channel, const char **str)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int core_index;
> +
> +	switch (attr) {
> +	case hwmon_temp_label:
> +		if (channel < DEFAULT_CHANNEL_NUMS) {
> +			*str = cputemp_label[channel];
> +		} else {
> +			core_index = find_core_index(priv, channel);

FWIW, it might be better to pass channel - DEFAULT_CHANNEL_NUMS
as parameter.

What if find_core_index() returns priv->gen_info->core_max, ie
if it didn't find a core ?

> +			*str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
> +		}
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_die(struct device *dev,
> +			    enum hwmon_sensor_types type,
> +			    u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_die_temp(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.die.value;
> +		return 0;
> +	case hwmon_temp_max:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value;
> +		return 0;
> +	case hwmon_temp_crit:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;
> +	case hwmon_temp_crit_hyst:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_dts_margin(struct device *dev,
> +				   enum hwmon_sensor_types type,
> +				   u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_dts_margin(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.dts_margin.value;
> +		return 0;
> +	case hwmon_temp_min:
> +		*val = 0;
> +		return 0;

This attribute should not exist.

> +	case hwmon_temp_lcrit:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value - priv->temp.tjmax.value;

lcrit is tcontrol - tjmax, and crit_hyst above is
tjmax - tcontrol ? How does this make sense ?

> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_tcontrol(struct device *dev,
> +				 enum hwmon_sensor_types type,
> +				 u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value;
> +		return 0;
> +	case hwmon_temp_crit:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;

Am I missing something, or is the same temperature reported several times ?
tjmax is also reported as temp_crit cputemp_read_die(), for example.

> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_tthrottle(struct device *dev,
> +				  enum hwmon_sensor_types type,
> +				  u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_tthrottle(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tthrottle.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_tjmax(struct device *dev,
> +			      enum hwmon_sensor_types type,
> +			      u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_core(struct device *dev,
> +			     enum hwmon_sensor_types type,
> +			     u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int core_index = find_core_index(priv, channel);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_core_temp(priv, core_index);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.core[core_index].value;
> +		return 0;
> +	case hwmon_temp_max:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value;
> +		return 0;
> +	case hwmon_temp_crit:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;
> +	case hwmon_temp_crit_hyst:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}

There is again a lot of duplication in those functions.

> +
> +static int cputemp_read(struct device *dev,
> +			enum hwmon_sensor_types type,
> +			u32 attr, int channel, long *val)
> +{
> +	switch (channel) {
> +	case channel_die:
> +		return cputemp_read_die(dev, type, attr, channel, val);
> +	case channel_dts_mrgn:
> +		return cputemp_read_dts_margin(dev, type, attr, channel, val);
> +	case channel_tcontrol:
> +		return cputemp_read_tcontrol(dev, type, attr, channel, val);
> +	case channel_tthrottle:
> +		return cputemp_read_tthrottle(dev, type, attr, channel, val);
> +	case channel_tjmax:
> +		return cputemp_read_tjmax(dev, type, attr, channel, val);
> +	default:
> +		if (channel < CPUTEMP_CHANNEL_NUMS)
> +			return cputemp_read_core(dev, type, attr, channel, val);
> +
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static umode_t cputemp_is_visible(const void *data,
> +				  enum hwmon_sensor_types type,
> +				  u32 attr, int channel)
> +{
> +	const struct peci_cputemp *priv = data;
> +
> +	if (priv->temp_config[channel] & BIT(attr))
> +		return 0444;
> +
> +	return 0;
> +}
> +
> +static const struct hwmon_ops cputemp_ops = {
> +	.is_visible = cputemp_is_visible,
> +	.read_string = cputemp_read_string,
> +	.read = cputemp_read,
> +};
> +
> +static int check_resolved_cores(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pci_cfg_local_msg msg;
> +	int rc;
> +
> +	if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
> +		return -EINVAL;
> +
> +	/* Get the RESOLVED_CORES register value */
> +	msg.addr = priv->addr;
> +	msg.bus = 1;
> +	msg.device = 30;
> +	msg.function = 3;
> +	msg.reg = 0xB4;

Can this be made less magic with some defines ?

> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
> +	if (rc)
> +		return rc;
> +
> +	priv->core_mask = msg.pci_config[3] << 24 |
> +			  msg.pci_config[2] << 16 |
> +			  msg.pci_config[1] << 8 |
> +			  msg.pci_config[0];
> +
> +	if (!priv->core_mask)
> +		return -EAGAIN;
> +
> +	dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
> +	return 0;
> +}
> +
> +static int create_core_temp_info(struct peci_cputemp *priv)
> +{
> +	int rc, i;
> +
> +	rc = check_resolved_cores(priv);
> +	if (!rc) {
> +		for (i = 0; i < priv->gen_info->core_max; i++) {
> +			if (priv->core_mask & BIT(i)) {
> +				priv->temp_config[priv->config_idx++] =
> +						     config_table[channel_core];
> +			}
> +		}
> +	}
> +
> +	return rc;
> +}
> +
> +static int check_cpu_id(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	u32 cpu_id;
> +	int i, rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_CPU_ID;
> +	msg.param = PKG_ID_CPU_ID;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
> +		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
> +
> +	for (i = 0; i < CPU_GEN_MAX; i++) {
> +		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
> +			priv->gen_info = &cpu_gen_info_table[i];
> +			break;
> +		}
> +	}
> +
> +	if (!priv->gen_info)
> +		return -ENODEV;
> +
> +	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
> +	return 0;
> +}
> +
> +static int peci_cputemp_probe(struct peci_client *client)
> +{
> +	struct device *dev = &client->dev;
> +	struct peci_cputemp *priv;
> +	struct device *hwmon_dev;
> +	int rc;
> +
> +	if ((client->adapter->cmd_mask &
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
> +		dev_err(dev, "Client doesn't support temperature monitoring\n");
> +		return -EINVAL;

Does this mean there will be an error message for each non-supported CPU ?
Why ?

> +	}
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	dev_set_drvdata(dev, priv);
> +	priv->client = client;
> +	priv->dev = dev;
> +	priv->addr = client->addr;
> +	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
> +
> +	snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
> +		 priv->cpu_no);
> +
> +	rc = check_cpu_id(priv);
> +	if (rc) {
> +		dev_err(dev, "Client CPU is not supported\n");

-ENODEV is not an error, and should not result in an error message.
Besides, the error can also be propagated from peci core code,
and may well be something else.

> +		return rc;
> +	}
> +
> +	priv->temp_config[priv->config_idx++] = config_table[channel_die];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
> +
> +	rc = create_core_temp_info(priv);
> +	if (rc)
> +		dev_dbg(dev, "Failed to create core temp info\n");

Then what ? Shouldn't this result in probe deferral or something more useful
instead of just being ignored ?

> +
> +	priv->chip.ops = &cputemp_ops;
> +	priv->chip.info = priv->info;
> +
> +	priv->info[0] = &priv->temp_info;
> +
> +	priv->temp_info.type = hwmon_temp;
> +	priv->temp_info.config = priv->temp_config;
> +
> +	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
> +							 priv->name,
> +							 priv,
> +							 &priv->chip,
> +							 NULL);
> +
> +	if (IS_ERR(hwmon_dev))
> +		return PTR_ERR(hwmon_dev);
> +
> +	dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id peci_cputemp_of_table[] = {
> +	{ .compatible = "intel,peci-cputemp" },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
> +
> +static struct peci_driver peci_cputemp_driver = {
> +	.probe  = peci_cputemp_probe,
> +	.driver = {
> +		.name           = "peci-cputemp",
> +		.of_match_table = of_match_ptr(peci_cputemp_of_table),
> +	},
> +};
> +module_peci_driver(peci_cputemp_driver);
> +
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_DESCRIPTION("PECI cputemp driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
> new file mode 100644
> index 000000000000..78bf29cb2c4c
> --- /dev/null
> +++ b/drivers/hwmon/peci-dimmtemp.c

FWIW, this should be two separate patches.

> @@ -0,0 +1,432 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2018 Intel Corporation
> +
> +#include <linux/delay.h>
> +#include <linux/hwmon.h>
> +#include <linux/hwmon-sysfs.h>

Needed ?

> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/of_device.h>
> +#include <linux/peci.h>
> +#include <linux/workqueue.h>
> +
> +#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
> +
> +#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
> +#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
> +
> +#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
> +#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
> +
> +#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
> +#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
> +
> +#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
> +#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
> +
> +#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
> +
> +#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
> +
> +#define UPDATE_INTERVAL_MIN  HZ
> +
> +#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
> +#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
> +
> +enum cpu_gens {
> +	CPU_GEN_HSX, /* Haswell Xeon */
> +	CPU_GEN_BRX, /* Broadwell Xeon */
> +	CPU_GEN_SKX, /* Skylake Xeon */
> +	CPU_GEN_MAX
> +};
> +
> +struct cpu_gen_info {
> +	u32 type;
> +	u32 cpu_id;
> +	u32 chan_rank_max;
> +	u32 dimm_idx_max;
> +};
> +
> +struct temp_data {
> +	bool valid;
> +	s32  value;
> +	unsigned long last_updated;
> +};
> +
> +struct peci_dimmtemp {
> +	struct peci_client *client;
> +	struct device *dev;
> +	struct workqueue_struct *work_queue;
> +	struct delayed_work work_handler;
> +	char name[PECI_NAME_SIZE];
> +	struct temp_data temp[DIMM_NUMS_MAX];
> +	u8 addr;
> +	uint cpu_no;
> +	const struct cpu_gen_info *gen_info;
> +	u32 dimm_mask;
> +	int retry_count;
> +	int channels;
> +	u32 temp_config[DIMM_NUMS_MAX + 1];
> +	struct hwmon_channel_info temp_info;
> +	const struct hwmon_channel_info *info[2];
> +	struct hwmon_chip_info chip;
> +};
> +
> +static const struct cpu_gen_info cpu_gen_info_table[] = {
> +	{ .type  = CPU_GEN_HSX,
> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
> +	  .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
> +	{ .type  = CPU_GEN_BRX,
> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
> +	  .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
> +	{ .type  = CPU_GEN_SKX,
> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
> +	  .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
> +};
> +
> +static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
> +	{ "DIMM A0", "DIMM A1", "DIMM A2" },
> +	{ "DIMM B0", "DIMM B1", "DIMM B2" },
> +	{ "DIMM C0", "DIMM C1", "DIMM C2" },
> +	{ "DIMM D0", "DIMM D1", "DIMM D2" },
> +	{ "DIMM E0", "DIMM E1", "DIMM E2" },
> +	{ "DIMM F0", "DIMM F1", "DIMM F2" },
> +	{ "DIMM G0", "DIMM G1", "DIMM G2" },
> +	{ "DIMM H0", "DIMM H1", "DIMM H2" },
> +};
> +
> +static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
> +			 void *msg)
> +{
> +	return peci_command(priv->client->adapter, cmd, msg);
> +}
> +
> +static int need_update(struct temp_data *temp)
> +{
> +	if (temp->valid &&
> +	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +static void mark_updated(struct temp_data *temp)
> +{
> +	temp->valid = true;
> +	temp->last_updated = jiffies;
> +}

It might make sense to provide the duplicate functions in a core file.

> +
> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
> +{
> +	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> +	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> +	struct peci_rd_pkg_cfg_msg msg;
> +	int rc;
> +
> +	if (!need_update(&priv->temp[dimm_no]))
> +		return 0;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_DDR_DIMM_TEMP;
> +	msg.param = chan_rank;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
> +
> +	mark_updated(&priv->temp[dimm_no]);
> +
> +	return 0;
> +}
> +
> +static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
> +{
> +	int dimm_nums_max = priv->gen_info->chan_rank_max *
> +			    priv->gen_info->dimm_idx_max;
> +	int idx, found = 0;
> +
> +	for (idx = 0; idx < dimm_nums_max; idx++) {
> +		if (priv->dimm_mask & BIT(idx)) {
> +			if (channel == found)
> +				break;
> +
> +			found++;
> +		}
> +	}
> +
> +	return idx;
> +}

This again looks like duplicate code.

> +
> +static int dimmtemp_read_string(struct device *dev,
> +				enum hwmon_sensor_types type,
> +				u32 attr, int channel, const char **str)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> +	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
> +	int dimm_no, chan_rank, dimm_idx;
> +
> +	switch (attr) {
> +	case hwmon_temp_label:
> +		dimm_no = find_dimm_number(priv, channel);
> +		chan_rank = dimm_no / dimm_idx_max;
> +		dimm_idx = dimm_no % dimm_idx_max;
> +		*str = dimmtemp_label[chan_rank][dimm_idx];
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
> +			 u32 attr, int channel, long *val)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> +	int dimm_no = find_dimm_number(priv, channel);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_dimm_temp(priv, dimm_no);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp[dimm_no].value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static umode_t dimmtemp_is_visible(const void *data,
> +				   enum hwmon_sensor_types type,
> +				   u32 attr, int channel)
> +{
> +	switch (attr) {
> +	case hwmon_temp_label:
> +	case hwmon_temp_input:
> +		return 0444;
> +	default:
> +		return 0;
> +	}
> +}
> +
> +static const struct hwmon_ops dimmtemp_ops = {
> +	.is_visible = dimmtemp_is_visible,
> +	.read_string = dimmtemp_read_string,
> +	.read = dimmtemp_read,
> +};
> +
> +static int check_populated_dimms(struct peci_dimmtemp *priv)
> +{
> +	u32 chan_rank_max = priv->gen_info->chan_rank_max;
> +	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
> +	struct peci_rd_pkg_cfg_msg msg;
> +	int chan_rank, dimm_idx;
> +	int rc, channels = 0;
> +
> +	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
> +		msg.addr = priv->addr;
> +		msg.index = MBX_INDEX_DDR_DIMM_TEMP;
> +		msg.param = chan_rank;
> +		msg.rx_len = 4;
> +
> +		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +		if (rc) {
> +			priv->dimm_mask = 0;
> +			return rc;
> +		}
> +
> +		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
> +			if (msg.pkg_config[dimm_idx]) {
> +				priv->dimm_mask |= BIT(chan_rank *
> +						       chan_rank_max +
> +						       dimm_idx);
> +				channels++;
> +			}
> +		}
> +	}
> +
> +	if (!priv->dimm_mask)
> +		return -EAGAIN;
> +
> +	priv->channels = channels;
> +
> +	dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
> +	return 0;
> +}
> +
> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
> +{
> +	struct device *hwmon_dev;
> +	int rc, i;
> +
> +	rc = check_populated_dimms(priv);
> +	if (!rc) {

Please handle error cases first.

> +		for (i = 0; i < priv->channels; i++)
> +			priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
> +
> +		priv->chip.ops = &dimmtemp_ops;
> +		priv->chip.info = priv->info;
> +
> +		priv->info[0] = &priv->temp_info;
> +
> +		priv->temp_info.type = hwmon_temp;
> +		priv->temp_info.config = priv->temp_config;
> +
> +		hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
> +								 priv->name,
> +								 priv,
> +								 &priv->chip,
> +								 NULL);
> +		rc = PTR_ERR_OR_ZERO(hwmon_dev);
> +		if (!rc)
> +			dev_dbg(priv->dev, "%s: sensor '%s'\n",
> +				dev_name(hwmon_dev), priv->name);
> +	} else if (rc == -EAGAIN) {
> +		if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
> +			queue_delayed_work(priv->work_queue,
> +					   &priv->work_handler,
> +					   DIMM_MASK_CHECK_DELAY_JIFFIES);
> +			priv->retry_count++;
> +			dev_dbg(priv->dev,
> +				"Deferred DIMM temp info creation\n");
> +		} else {
> +			rc = -ETIMEDOUT;
> +			dev_err(priv->dev,
> +				"Timeout retrying DIMM temp info creation\n");
> +		}
> +	}
> +
> +	return rc;
> +}
> +
> +static void create_dimm_temp_info_delayed(struct work_struct *work)
> +{
> +	struct delayed_work *dwork = to_delayed_work(work);
> +	struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
> +						  work_handler);
> +	int rc;
> +
> +	rc = create_dimm_temp_info(priv);
> +	if (rc && rc != -EAGAIN)
> +		dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
> +}
> +
> +static int check_cpu_id(struct peci_dimmtemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	u32 cpu_id;
> +	int i, rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_CPU_ID;
> +	msg.param = PKG_ID_CPU_ID;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
> +		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
> +
> +	for (i = 0; i < CPU_GEN_MAX; i++) {
> +		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
> +			priv->gen_info = &cpu_gen_info_table[i];
> +			break;
> +		}
> +	}
> +
> +	if (!priv->gen_info)
> +		return -ENODEV;
> +
> +	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
> +	return 0;
> +}

More duplicate code.

> +
> +static int peci_dimmtemp_probe(struct peci_client *client)
> +{
> +	struct device *dev = &client->dev;
> +	struct peci_dimmtemp *priv;
> +	int rc;
> +
> +	if ((client->adapter->cmd_mask &
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {

One set of ( ) is unnecessary on each side of the expression.

> +		dev_err(dev, "Client doesn't support temperature monitoring\n");
> +		return -EINVAL;

Why is this "invalid", and why does it warrant an error message ?

> +	}
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	dev_set_drvdata(dev, priv);
> +	priv->client = client;
> +	priv->dev = dev;
> +	priv->addr = client->addr;
> +	priv->cpu_no = priv->addr - PECI_BASE_ADDR;

Is priv->addr guaranteed to be >= PECI_BASE_ADDR ?
> +
> +	snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
> +		 priv->cpu_no);
> +
> +	rc = check_cpu_id(priv);
> +	if (rc) {
> +		dev_err(dev, "Client CPU is not supported\n");

Or the peci command failed.

> +		return rc;
> +	}
> +
> +	priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
> +	if (!priv->work_queue)
> +		return -ENOMEM;
> +
> +	INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
> +
> +	rc = create_dimm_temp_info(priv);
> +	if (rc && rc != -EAGAIN) {
> +		dev_err(dev, "Failed to create DIMM temp info\n");
> +		goto err_free_wq;
> +	}
> +
> +	return 0;
> +
> +err_free_wq:
> +	destroy_workqueue(priv->work_queue);
> +	return rc;
> +}
> +
> +static int peci_dimmtemp_remove(struct peci_client *client)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
> +
> +	cancel_delayed_work(&priv->work_handler);

cancel_delayed_work_sync() ?

> +	destroy_workqueue(priv->work_queue);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id peci_dimmtemp_of_table[] = {
> +	{ .compatible = "intel,peci-dimmtemp" },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
> +
> +static struct peci_driver peci_dimmtemp_driver = {
> +	.probe  = peci_dimmtemp_probe,
> +	.remove = peci_dimmtemp_remove,
> +	.driver = {
> +		.name           = "peci-dimmtemp",
> +		.of_match_table = of_match_ptr(peci_dimmtemp_of_table),
> +	},
> +};
> +module_peci_driver(peci_dimmtemp_driver);
> +
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_DESCRIPTION("PECI dimmtemp driver");
> +MODULE_LICENSE("GPL v2");
> -- 
> 2.16.2
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC bpf-next v2 3/8] bpf: add documentation for eBPF helpers (12-22)
From: Alexei Starovoitov @ 2018-04-10 22:43 UTC (permalink / raw)
  To: Quentin Monnet; +Cc: daniel, ast, netdev, oss-drivers, linux-doc, linux-man
In-Reply-To: <20180410144157.4831-4-quentin.monnet@netronome.com>

On Tue, Apr 10, 2018 at 03:41:52PM +0100, Quentin Monnet wrote:
> Add documentation for eBPF helper functions to bpf.h user header file.
> This documentation can be parsed with the Python script provided in
> another commit of the patch series, in order to provide a RST document
> that can later be converted into a man page.
> 
> The objective is to make the documentation easily understandable and
> accessible to all eBPF developers, including beginners.
> 
> This patch contains descriptions for the following helper functions, all
> writter by Alexei:
> 
> - bpf_get_current_pid_tgid()
> - bpf_get_current_uid_gid()
> - bpf_get_current_comm()
> - bpf_skb_vlan_push()
> - bpf_skb_vlan_pop()
> - bpf_skb_get_tunnel_key()
> - bpf_skb_set_tunnel_key()
> - bpf_redirect()
> - bpf_perf_event_output()
> - bpf_get_stackid()
> - bpf_get_current_task()
> 
> Cc: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> ---
>  include/uapi/linux/bpf.h | 237 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 237 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 2bc653a3a20f..f3ea8824efbc 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -580,6 +580,243 @@ union bpf_attr {
>   * 		performed again.
>   * 	Return
>   * 		0 on success, or a negative error in case of failure.
> + *
> + * u64 bpf_get_current_pid_tgid(void)
> + * 	Return
> + * 		A 64-bit integer containing the current tgid and pid, and
> + * 		created as such:
> + * 		*current_task*\ **->tgid << 32 \|**
> + * 		*current_task*\ **->pid**.
> + *
> + * u64 bpf_get_current_uid_gid(void)
> + * 	Return
> + * 		A 64-bit integer containing the current GID and UID, and
> + * 		created as such: *current_gid* **<< 32 \|** *current_uid*.
> + *
> + * int bpf_get_current_comm(char *buf, u32 size_of_buf)
> + * 	Description
> + * 		Copy the **comm** attribute of the current task into *buf* of
> + * 		*size_of_buf*. The **comm** attribute contains the name of
> + * 		the executable (excluding the path) for the current task. The
> + * 		*size_of_buf* must be strictly positive. On success, the

that reminds me that we probably should relax it to ARG_CONST_SIZE_OR_ZERO.
The programs won't be passing an actual zero into it, but it helps
a lot to tell verifier that zero is also valid, since programs
become much simpler.

> + * 		helper makes sure that the *buf* is NUL-terminated. On failure,
> + * 		it is filled with zeroes.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci)
> + * 	Description
> + * 		Push a *vlan_tci* (VLAN tag control information) of protocol
> + * 		*vlan_proto* to the packet associated to *skb*, then update
> + * 		the checksum. Note that if *vlan_proto* is different from
> + * 		**ETH_P_8021Q** and **ETH_P_8021AD**, it is considered to
> + * 		be **ETH_P_8021Q**.
> + *
> + * 		A call to this helper is susceptible to change data from the
> + * 		packet. Therefore, at load time, all checks on pointers
> + * 		previously done by the verifier are invalidated and must be
> + * 		performed again.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_skb_vlan_pop(struct sk_buff *skb)
> + * 	Description
> + * 		Pop a VLAN header from the packet associated to *skb*.
> + *
> + * 		A call to this helper is susceptible to change data from the
> + * 		packet. Therefore, at load time, all checks on pointers
> + * 		previously done by the verifier are invalidated and must be
> + * 		performed again.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_skb_get_tunnel_key(struct sk_buff *skb, struct bpf_tunnel_key *key, u32 size, u64 flags)
> + * 	Description
> + * 		Get tunnel metadata. This helper takes a pointer *key* to an
> + * 		empty **struct bpf_tunnel_key** of **size**, that will be
> + * 		filled with tunnel metadata for the packet associated to *skb*.
> + * 		The *flags* can be set to **BPF_F_TUNINFO_IPV6**, which
> + * 		indicates that the tunnel is based on IPv6 protocol instead of
> + * 		IPv4.
> + *
> + * 		This is typically used on the receive path to perform a lookup
> + * 		or a packet redirection based on the value of *key*:

above is correct, but feels a bit cryptic.
May be give more concrete example for particular tunneling protocol like gre
and say that tunnel_key.remote_ip[46] is essential part of the encap and
bpf prog will make decisions based on the contents of the encap header
where bpf_tunnel_key is a single structure that generalizes parameters of
various tunneling protocols into one struct.

> + *
> + * 		::
> + *
> + * 			struct bpf_tunnel_key key = {};
> + * 			bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0);
> + * 			     lookup or redirect based on key ...
> + *
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_skb_set_tunnel_key(struct sk_buff *skb, struct bpf_tunnel_key *key, u32 size, u64 flags)
> + * 	Description
> + * 		Populate tunnel metadata for packet associated to *skb.* The
> + * 		tunnel metadata is set to the contents of *key*, of *size*. The
> + * 		*flags* can be set to a combination of the following values:
> + *
> + * 		**BPF_F_TUNINFO_IPV6**
> + * 			Indicate that the tunnel is based on IPv6 protocol
> + * 			instead of IPv4.
> + * 		**BPF_F_ZERO_CSUM_TX**
> + * 			For IPv4 packets, add a flag to tunnel metadata
> + * 			indicating that checksum computation should be skipped
> + * 			and checksum set to zeroes.
> + * 		**BPF_F_DONT_FRAGMENT**
> + * 			Add a flag to tunnel metadata indicating that the
> + * 			packet should not be fragmented.
> + * 		**BPF_F_SEQ_NUMBER**
> + * 			Add a flag to tunnel metadata indicating that a
> + * 			sequence number should be added to tunnel header before
> + * 			sending the packet. This flag was added for GRE
> + * 			encapsulation, but might be used with other protocols
> + * 			as well in the future.
> + *
> + * 		Here is a typical usage on the transmit path:
> + *
> + * 		::
> + *
> + * 			struct bpf_tunnel_key key;
> + * 			     populate key ...
> + * 			bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0);
> + * 			bpf_clone_redirect(skb, vxlan_dev_ifindex, 0);
> + *
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_redirect(u32 ifindex, u64 flags)
> + * 	Description
> + * 		Redirect the packet to another net device of index *ifindex*.
> + * 		This helper is somewhat similar to **bpf_clone_redirect**\
> + * 		(), except that the packet is not cloned, which provides
> + * 		increased performance.
> + *
> + * 		For hooks other than XDP, *flags* can be set to
> + * 		**BPF_F_INGRESS**, which indicates the packet is to be
> + * 		redirected to the ingress interface instead of (by default)
> + * 		egress. Currently, XDP does not support any flag.
> + * 	Return
> + * 		For XDP, the helper returns **XDP_REDIRECT** on success or
> + * 		**XDP_ABORT** on error. For other program types, the values
> + * 		are **TC_ACT_REDIRECT** on success or **TC_ACT_SHOT** on
> + * 		error.
> + *
> + * int bpf_perf_event_output(struct pt_reg *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
> + * 	Description
> + * 		Write perf raw sample into a perf event held by *map* of type

I'd say:
Write raw *data* blob into special bpf perf event held by ...

> + * 		**BPF_MAP_TYPE_PERF_EVENT_ARRAY**. This perf event must
> + * 		have the following attributes: **PERF_SAMPLE_RAW** as
> + * 		**sample_type**, **PERF_TYPE_SOFTWARE** as **type**, and
> + * 		**PERF_COUNT_SW_BPF_OUTPUT** as **config**.
> + *
> + * 		The *flags* are used to indicate the index in *map* for which
> + * 		the value must be put, masked with **BPF_F_INDEX_MASK**.
> + * 		Alternatively, *flags* can be set to **BPF_F_CURRENT_CPU**
> + * 		to indicate that the index of the current CPU core should be
> + * 		used.
> + *
> + * 		The value to write, of *size*, is passed through eBPF stack and
> + * 		pointed by *data*.
> + *
> + * 		The context of the program *ctx* needs also be passed to the
> + * 		helper, and will get interpreted as a pointer to a **struct
> + * 		pt_reg**.

Not quite correct.
Initially bpf_perf_event_output() was only used with 'struct pt_reg *ctx',
but then later it was generalized for all other tracing prog types,
for clsact and even for XDP.
So 'ctx' can be any of the context used by these program types.

> + *
> + * 		On user space, a program willing to read the values needs to
> + * 		call **perf_event_open**\ () on the perf event (either for
> + * 		one or for all CPUs) and to store the file descriptor into the
> + * 		*map*. This must be done before the eBPF program can send data
> + * 		into it. An example is available in file
> + * 		*samples/bpf/trace_output_user.c* in the Linux kernel source
> + * 		tree (the eBPF program counterpart is in
> + * 		*samples/bpf/trace_output_kern.c*). It looks like the
> + * 		following snippet:
> + *
> + * 		::
> + *
> + * 			volatile struct perf_event_mmap_page *header;
> + * 			struct perf_event_attr attr = {
> + * 			        .sample_type = PERF_SAMPLE_RAW,
> + * 			        .type = PERF_TYPE_SOFTWARE,
> + * 			        .config = PERF_COUNT_SW_BPF_OUTPUT,
> + * 			};
> + * 			int page_size;
> + * 			int mmap_size;
> + * 			int key = 0;
> + * 			int pmu_fd;
> + * 			void *base;
> + * 			
> + * 			if (load_bpf_file(filename))
> + * 			        return -1;
> + * 			
> + * 			pmu_fd = sys_perf_event_open(&attr,
> + * 			                             -1, // pid
> + * 			                              0, // cpu
> + * 			                             -1, // group_fd
> + * 			                              0);
> + * 			
> + * 			assert(pmu_fd >= 0);
> + * 			assert(bpf_map_update_elem(map_fd[0], &key,
> + * 			                           &pmu_fd, BPF_ANY) == 0);
> + * 			assert(ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0) == 0);
> + * 			
> + * 			page_size = getpagesize();
> + * 			mmap_size = page_size * (page_cnt + 1);
> + * 			
> + * 			base = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> + * 			            MAP_SHARED, fd, 0);
> + * 			if (base == MAP_FAILED)
> + * 			        return -1;
> + * 			
> + * 			header = base;

I think that is too much for the man page, especially above is far from
complete example.

> + *
> + * 		**bpf_perf_event_output**\ () achieves better performance
> + * 		than **bpf_trace_printk**\ () for sharing data with user
> + * 		space, and is much better suitable for streaming data from eBPF
> + * 		programs.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_get_stackid(struct pt_reg *ctx, struct bpf_map *map, u64 flags)
> + * 	Description
> + * 		Walk a user or a kernel stack and return its id. To achieve
> + * 		this, the helper needs *ctx*, which is a pointer to the context
> + * 		on which the tracing program is executed, and a pointer to a
> + * 		*map* of type **BPF_MAP_TYPE_STACK_TRACE**.
> + *
> + * 		The last argument, *flags*, holds the number of stack frames to
> + * 		skip (from 0 to 255), masked with
> + * 		**BPF_F_SKIP_FIELD_MASK**. The next bits can be used to set
> + * 		a combination of the following flags:
> + *
> + * 		**BPF_F_USER_STACK**
> + * 			Collect a user space stack instead of a kernel stack.
> + * 		**BPF_F_FAST_STACK_CMP**
> + * 			Compare stacks by hash only.
> + * 		**BPF_F_REUSE_STACKID**
> + * 			If two different stacks hash into the same *stackid*,
> + * 			discard the old one.

we have an annoying bug here that we will be sending a patch to fix soon,
since right now there is no way for the program to know that stackid
got replaced.

> + *
> + * 		The stack id retrieved is a 32 bit long integer handle which
> + * 		can be further combined with other data (including other stack
> + * 		ids) and used as a key into maps. This can be useful for
> + * 		generating a variety of graphs (such as flame graphs or off-cpu
> + * 		graphs).
> + *
> + * 		For walking a stack, this helper is an improvement over
> + * 		**bpf_probe_read**\ (), which can be used with unrolled loops
> + * 		but is not efficient and consumes a lot of eBPF instructions.
> + * 		Instead, **bpf_get_stackid**\ () can collect up to
> + * 		**PERF_MAX_STACK_DEPTH** both kernel and user frames.

PERF_MAX_STACK_DEPTH is now controlled by sysctl knob.
Would be good to mention that this limit can and should be increased
for profiling long user stacks like java.

> + * 	Return
> + * 		The positive or null stack id on success, or a negative error
> + * 		in case of failure.
> + *
> + * u64 bpf_get_current_task(void)
> + * 	Return
> + * 		A pointer to the current task struct.
>   */
>  #define __BPF_FUNC_MAPPER(FN)		\
>  	FN(unspec),			\
> -- 
> 2.14.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2 1/2] mm: introduce ARCH_HAS_PTE_SPECIAL
From: David Rientjes @ 2018-04-10 20:51 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Matthew Wilcox, linux-kernel, linux-mm, linuxppc-dev, x86,
	linux-doc, linux-snps-arc, linux-arm-kernel, linux-riscv,
	linux-s390, linux-sh, sparclinux, Jerome Glisse, mhocko,
	aneesh.kumar, akpm, mpe, benh, paulus, Jonathan Corbet,
	Catalin Marinas, Will Deacon, Yoshinori Sato, Rich Felker,
	David S . Miller, Thomas Gleixner, Ingo Molnar, Vineet Gupta,
	Palmer Dabbelt, Albert Ou, Martin Schwidefsky, Heiko Carstens
In-Reply-To: <a732ef2b-445f-9ad8-014b-247c8c5d500b@linux.vnet.ibm.com>

On Tue, 10 Apr 2018, Laurent Dufour wrote:

> > On Tue, Apr 10, 2018 at 05:25:50PM +0200, Laurent Dufour wrote:
> >>  arch/powerpc/include/asm/pte-common.h                  | 3 ---
> >>  arch/riscv/Kconfig                                     | 1 +
> >>  arch/s390/Kconfig                                      | 1 +
> > 
> > You forgot to delete __HAVE_ARCH_PTE_SPECIAL from
> > arch/riscv/include/asm/pgtable-bits.h
> 
> Damned !
> Thanks for catching it.
> 

Squashing the two patches together at least allowed it to be caught 
easily.  After it's fixed, feel free to add

	Acked-by: David Rientjes <rientjes@google.com>

Thanks for doing this!
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2 1/2] mm: introduce ARCH_HAS_PTE_SPECIAL
From: Palmer Dabbelt @ 2018-04-10 20:44 UTC (permalink / raw)
  To: willy
  Cc: ldufour, linux-kernel, linux-mm, linuxppc-dev, x86, linux-doc,
	linux-snps-arc, linux-arm-kernel, linux-riscv, linux-s390,
	linux-sh, sparclinux, jglisse, mhocko, aneesh.kumar, akpm, mpe,
	benh, paulus, corbet, catalin.marinas, Will Deacon, ysato, dalias,
	davem, tglx, mingo, vgupta, albert, schwidefsky, heiko.carstens,
	rientjes
In-Reply-To: <20180410160932.GB3614@bombadil.infradead.org>

On Tue, 10 Apr 2018 09:09:32 PDT (-0700), willy@infradead.org wrote:
> On Tue, Apr 10, 2018 at 05:25:50PM +0200, Laurent Dufour wrote:
>>  arch/powerpc/include/asm/pte-common.h                  | 3 ---
>>  arch/riscv/Kconfig                                     | 1 +
>>  arch/s390/Kconfig                                      | 1 +
>
> You forgot to delete __HAVE_ARCH_PTE_SPECIAL from
> arch/riscv/include/asm/pgtable-bits.h

Thanks -- I was looking for that but couldn't find it and assumed I'd just 
misunderstood something.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] gpiolib: add hogs support for machine code
From: Bartosz Golaszewski @ 2018-04-10 20:31 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, Linus Walleij, Jonathan Corbet, linux-gpio, linux-doc,
	Linux Kernel Mailing List
In-Reply-To: <201804110047.jKfmWBhd%fengguang.wu@intel.com>

2018-04-10 19:05 GMT+02:00 kbuild test robot <lkp@intel.com>:
> Hi Bartosz,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on gpio/for-next]
> [also build test ERROR on v4.16 next-20180410]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url:    https://github.com/0day-ci/linux/commits/Bartosz-Golaszewski/gpiolib-add-hogs-support-for-machine-code/20180410-232047
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git for-next
> config: i386-randconfig-a0-201814 (attached as .config)
> compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=i386
>
> All errors (new ones prefixed by >>):
>
>    In file included from drivers//mfd/sm501.c:23:0:
>>> include/linux/gpio/machine.h:56:19: error: field 'dflags' has incomplete type
>      enum gpiod_flags dflags;
>                       ^
>
> vim +/dflags +56 include/linux/gpio/machine.h
>
>     41
>     42  /**
>     43   * struct gpiod_hog - GPIO line hog table
>     44   * @chip_label: name of the chip the GPIO belongs to
>     45   * @chip_hwnum: hardware number (i.e. relative to the chip) of the GPIO
>     46   * @line_name: consumer name for the hogged line
>     47   * @lflags: mask of GPIO lookup flags
>     48   * @dflags: GPIO flags used to specify the direction and value
>     49   */
>     50  struct gpiod_hog {
>     51          struct list_head list;
>     52          const char *chip_label;
>     53          u16 chip_hwnum;
>     54          const char *line_name;
>     55          enum gpio_lookup_flags lflags;
>   > 56          enum gpiod_flags dflags;
>     57  };
>     58
>
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Superseded by v2.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH v2] gpiolib: add hogs support for machine code
From: Bartosz Golaszewski @ 2018-04-10 20:30 UTC (permalink / raw)
  To: Linus Walleij, Jonathan Corbet
  Cc: linux-gpio, linux-doc, linux-kernel, Bartosz Golaszewski

Board files constitute a significant part of the users of the legacy
GPIO framework. In many cases they only export a line and set its
desired value. We could use GPIO hogs for that like we do for DT and
ACPI but there's no support for that in machine code.

This patch proposes to extend the machine.h API with support for
registering hog tables in board files.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
---
v1 -> v2:
- kbuild bot complains about enum gpiod_flags having incomplete type
  although it builds fine for me locally: change the type of dflags
  to int

 Documentation/driver-api/gpio/board.rst | 16 ++++++
 drivers/gpio/gpiolib.c                  | 67 +++++++++++++++++++++++++
 include/linux/gpio/machine.h            | 31 ++++++++++++
 3 files changed, 114 insertions(+)

diff --git a/Documentation/driver-api/gpio/board.rst b/Documentation/driver-api/gpio/board.rst
index 25d62b2e9fd0..2c112553df84 100644
--- a/Documentation/driver-api/gpio/board.rst
+++ b/Documentation/driver-api/gpio/board.rst
@@ -177,3 +177,19 @@ mapping and is thus transparent to GPIO consumers.
 
 A set of functions such as gpiod_set_value() is available to work with
 the new descriptor-oriented interface.
+
+Boards using platform data can also hog GPIO lines by defining GPIO hog tables.
+
+.. code-block:: c
+
+        struct gpiod_hog gpio_hog_table[] = {
+                GPIO_HOG("gpio.0", 10, "foo", GPIO_ACTIVE_LOW, GPIOD_OUT_HIGH),
+                { }
+        };
+
+And the table can be added to the board code as follows::
+
+        gpiod_add_hogs(gpio_hog_table);
+
+The line will be hogged as soon as the gpiochip is created or - in case the
+chip was created earlier - when the hog table is registered.
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 43aeb07343ec..547adc149b62 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -71,6 +71,9 @@ static DEFINE_MUTEX(gpio_lookup_lock);
 static LIST_HEAD(gpio_lookup_list);
 LIST_HEAD(gpio_devices);
 
+static DEFINE_MUTEX(gpio_machine_hogs_mutex);
+static LIST_HEAD(gpio_machine_hogs);
+
 static void gpiochip_free_hogs(struct gpio_chip *chip);
 static int gpiochip_add_irqchip(struct gpio_chip *gpiochip,
 				struct lock_class_key *lock_key,
@@ -1171,6 +1174,41 @@ static int gpiochip_setup_dev(struct gpio_device *gdev)
 	return status;
 }
 
+static void gpiochip_machine_hog(struct gpio_chip *chip, struct gpiod_hog *hog)
+{
+	struct gpio_desc *desc;
+	int rv;
+
+	desc = gpiochip_get_desc(chip, hog->chip_hwnum);
+	if (IS_ERR(desc)) {
+		pr_err("%s: unable to get GPIO desc: %ld\n",
+		       __func__, PTR_ERR(desc));
+		return;
+	}
+
+	if (desc->flags & FLAG_IS_HOGGED)
+		return;
+
+	rv = gpiod_hog(desc, hog->line_name, hog->lflags, hog->dflags);
+	if (rv)
+		pr_err("%s: unable to hog GPIO line (%s:%u): %d\n",
+		       __func__, chip->label, hog->chip_hwnum, rv);
+}
+
+static void machine_gpiochip_add(struct gpio_chip *chip)
+{
+	struct gpiod_hog *hog;
+
+	mutex_lock(&gpio_machine_hogs_mutex);
+
+	list_for_each_entry(hog, &gpio_machine_hogs, list) {
+		if (!strcmp(chip->label, hog->chip_label))
+			gpiochip_machine_hog(chip, hog);
+	}
+
+	mutex_unlock(&gpio_machine_hogs_mutex);
+}
+
 static void gpiochip_setup_devs(void)
 {
 	struct gpio_device *gdev;
@@ -1326,6 +1364,8 @@ int gpiochip_add_data_with_key(struct gpio_chip *chip, void *data,
 
 	acpi_gpiochip_add(chip);
 
+	machine_gpiochip_add(chip);
+
 	/*
 	 * By first adding the chardev, and then adding the device,
 	 * we get a device node entry in sysfs under
@@ -3462,6 +3502,33 @@ void gpiod_remove_lookup_table(struct gpiod_lookup_table *table)
 }
 EXPORT_SYMBOL_GPL(gpiod_remove_lookup_table);
 
+/**
+ * gpiod_add_hogs() - register a set of GPIO hogs from machine code
+ * @hogs: table of gpio hog entries with a zeroed sentinel at the end
+ */
+void gpiod_add_hogs(struct gpiod_hog *hogs)
+{
+	struct gpio_chip *chip;
+	struct gpiod_hog *hog;
+
+	mutex_lock(&gpio_machine_hogs_mutex);
+
+	for (hog = &hogs[0]; hog->chip_label; hog++) {
+		list_add_tail(&hog->list, &gpio_machine_hogs);
+
+		/*
+		 * The chip may have been registered earlier, so check if it
+		 * exists and, if so, try to hog the line now.
+		 */
+		chip = find_chip_by_name(hog->chip_label);
+		if (chip)
+			gpiochip_machine_hog(chip, hog);
+	}
+
+	mutex_unlock(&gpio_machine_hogs_mutex);
+}
+EXPORT_SYMBOL_GPL(gpiod_add_hogs);
+
 static struct gpiod_lookup_table *gpiod_find_lookup_table(struct device *dev)
 {
 	const char *dev_id = dev ? dev_name(dev) : NULL;
diff --git a/include/linux/gpio/machine.h b/include/linux/gpio/machine.h
index b2f2dc638463..daa44eac9241 100644
--- a/include/linux/gpio/machine.h
+++ b/include/linux/gpio/machine.h
@@ -39,6 +39,23 @@ struct gpiod_lookup_table {
 	struct gpiod_lookup table[];
 };
 
+/**
+ * struct gpiod_hog - GPIO line hog table
+ * @chip_label: name of the chip the GPIO belongs to
+ * @chip_hwnum: hardware number (i.e. relative to the chip) of the GPIO
+ * @line_name: consumer name for the hogged line
+ * @lflags: mask of GPIO lookup flags
+ * @dflags: GPIO flags used to specify the direction and value
+ */
+struct gpiod_hog {
+	struct list_head list;
+	const char *chip_label;
+	u16 chip_hwnum;
+	const char *line_name;
+	enum gpio_lookup_flags lflags;
+	int dflags;
+};
+
 /*
  * Simple definition of a single GPIO under a con_id
  */
@@ -59,10 +76,23 @@ struct gpiod_lookup_table {
 	.flags = _flags,                                                  \
 }
 
+/*
+ * Simple definition of a single GPIO hog in an array.
+ */
+#define GPIO_HOG(_chip_label, _chip_hwnum, _line_name, _lflags, _dflags)  \
+{                                                                         \
+	.chip_label = _chip_label,                                        \
+	.chip_hwnum = _chip_hwnum,                                        \
+	.line_name = _line_name,                                          \
+	.lflags = _lflags,                                                \
+	.dflags = _dflags,                                                \
+}
+
 #ifdef CONFIG_GPIOLIB
 void gpiod_add_lookup_table(struct gpiod_lookup_table *table);
 void gpiod_add_lookup_tables(struct gpiod_lookup_table **tables, size_t n);
 void gpiod_remove_lookup_table(struct gpiod_lookup_table *table);
+void gpiod_add_hogs(struct gpiod_hog *hogs);
 #else
 static inline
 void gpiod_add_lookup_table(struct gpiod_lookup_table *table) {}
@@ -70,6 +100,7 @@ static inline
 void gpiod_add_lookup_tables(struct gpiod_lookup_table **tables, size_t n) {}
 static inline
 void gpiod_remove_lookup_table(struct gpiod_lookup_table *table) {}
+static inline void gpiod_add_hogs(struct gpiod_hog *hogs) {}
 #endif
 
 #endif /* __LINUX_GPIO_MACHINE_H */
-- 
2.17.0

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds documents of generic PECI bus, adapter and client drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 .../devicetree/bindings/peci/peci-adapter.txt      | 23 ++++++++++++++++++++
 .../devicetree/bindings/peci/peci-bus.txt          | 15 +++++++++++++
 .../devicetree/bindings/peci/peci-client.txt       | 25 ++++++++++++++++++++++
 3 files changed, 63 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt

diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt b/Documentation/devicetree/bindings/peci/peci-adapter.txt
new file mode 100644
index 000000000000..9221374f6b11
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
@@ -0,0 +1,23 @@
+Generic device tree configuration for PECI adapters.
+
+Required properties:
+- compatible     : Should contain hardware specific definition strings that can
+		   match an adapter driver implementation.
+- reg            : Should contain PECI controller registers location and length.
+- #address-cells : Should be <1>.
+- #size-cells    : Should be <0>.
+
+Example:
+	peci: peci@10000000 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0x0 0x10000000 0x1000>;
+
+		peci0: peci-bus@0 {
+			compatible = "soc,soc-peci";
+			reg = <0x0 0x1000>;
+			#address-cells = <1>;
+			#size-cells = <0>;
+		};
+	};
diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt b/Documentation/devicetree/bindings/peci/peci-bus.txt
new file mode 100644
index 000000000000..90bcc791ccb0
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
@@ -0,0 +1,15 @@
+Generic device tree configuration for PECI buses.
+
+Required properties:
+- compatible     : Should be "simple-bus".
+- #address-cells : Should be <1>.
+- #size-cells    : Should be <1>.
+- ranges         : Should contain PECI controller registers ranges.
+
+Example:
+	peci: peci@10000000 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0x0 0x10000000 0x1000>;
+	};
diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt b/Documentation/devicetree/bindings/peci/peci-client.txt
new file mode 100644
index 000000000000..8e2bfd8532f6
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-client.txt
@@ -0,0 +1,25 @@
+Generic device tree configuration for PECI clients.
+
+Required properties:
+- compatible : Should contain target device specific definition strings that can
+	       match a client driver implementation.
+- reg        : Should contain address of a client CPU. Address range of CPU
+	       clients is starting from 0x30 based on PECI specification.
+	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
+
+Example:
+	peci-bus@0 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		< more properties >
+
+		function@cpu0 {
+			compatible = "device,function";
+			reg = <0x30>;
+		};
+
+		function@cpu1 {
+			compatible = "device,function";
+			reg = <0x31>;
+		};
+	};
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 00/10] PECI device driver introduction
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

Introduction of the Platform Environment Control Interface (PECI) bus
device driver. PECI is a one-wire bus interface that provides a
communication channel between an Intel processor and chipset components to
external monitoring or control devices. PECI is designed to support the
following sideband functions:

* Processor and DRAM thermal management
  - Processor fan speed control is managed by comparing Digital Thermal
    Sensor (DTS) thermal readings acquired via PECI against the
    processor-specific fan speed control reference point, or TCONTROL. Both
    TCONTROL and DTS thermal readings are accessible via the processor PECI
    client. These variables are referenced to a common temperature, the TCC
    activation point, and are both defined as negative offsets from that
    reference.
  - PECI based access to the processor package configuration space provides
    a means for Baseboard Management Controllers (BMC) or other platform
    management devices to actively manage the processor and memory power
    and thermal features.

* Platform Manageability
  - Platform manageability functions including thermal, power, and error
    monitoring. Note that platform 'power' management includes monitoring
    and control for both the processor and DRAM subsystem to assist with
    data center power limiting.
  - PECI allows read access to certain error registers in the processor MSR
    space and status monitoring registers in the PCI configuration space
    within the processor and downstream devices.
  - PECI permits writes to certain registers in the processor PCI
    configuration space.

* Processor Interface Tuning and Diagnostics
  - Processor interface tuning and diagnostics capabilities
    (Intel Interconnect BIST). The processors Intel Interconnect Built In
    Self Test (Intel IBIST) allows for infield diagnostic capabilities in
    the Intel UPI and memory controller interfaces. PECI provides a port to
    execute these diagnostics via its PCI Configuration read and write
    capabilities.

* Failure Analysis
  - Output the state of the processor after a failure for analysis via
    Crashdump.

PECI uses a single wire for self-clocking and data transfer. The bus
requires no additional control lines. The physical layer is a self-clocked
one-wire bus that begins each bit with a driven, rising edge from an idle
level near zero volts. The duration of the signal driven high depends on
whether the bit value is a logic '0' or logic '1'. PECI also includes
variable data transfer rate established with every message. In this way, it
is highly flexible even though underlying logic is simple.

The interface design was optimized for interfacing between an Intel
processor and chipset components in both single processor and multiple
processor environments. The single wire interface provides low board
routing overhead for the multiple load connections in the congested routing
area near the processor and chipset components. Bus speed, error checking,
and low protocol overhead provides adequate link bandwidth and reliability
to transfer critical device operating conditions and configuration
information.

This implementation provides the basic framework to add PECI extensions to
the Linux bus and device models. A hardware specific 'Adapter' driver can
be attached to the PECI bus to provide sideband functions described above.
It is also possible to access all devices on an adapter from userspace
through the /dev interface. A device specific 'Client' driver also can be
attached to the PECI bus so each processor client's features can be
supported by the 'Client' driver through an adapter connection in the bus.
This patch set includes Aspeed 24xx/25xx PECI driver and PECI
cputemp/dimmtemp drivers as the first implementation for both adapter and
client drivers on the PECI bus framework.

Please review.

Thanks,

-Jae

Changes from v2:
* Divided peci-hwmon driver into two drivers, peci-cputemp and
  peci-dimmtemp.
* Added generic dt binding documents for PECI bus, adapter and client.
* Removed in_atomic() call from the PECI core driver.
* Improved PECI commands masking logic.
* Added permission check logic for PECI ioctls.
* Removed unnecessary type casts.
* Fixed some invalid error return codes.
* Added the mark_updated() function to improve update interval checking
  logic.
* Fixed a bug in populated DIMM checking function.
* Fixed some typo, grammar and style issues in documents.
* Rewrote hwmon drivers to use devm_hwmon_device_register_with_info API.
* Made peci_match_id() function as a static.
* Replaced a deprecated create_singlethread_workqueue() call with an
  alloc_ordered_workqueue() call.
* Reordered local variable definitions in reversed xmas tree notation.
* Listed up client CPUs that can be supported by peci-cputemp and
  peci-dimmtemp hwmon drivers.
* Added CPU generation detection logic which checks CPUID signature through
  PECI connection.
* Improved interrupt handling logic in the Aspeed PECI adapter driver.
* Fixed SPDX license identifier style in header files.
* Changed some macros in peci.h to static inline functions.
* Dropped sleepable context checking code in peci-core.
* Adjusted rt_mutex protection scope in peci-core.
* Moved adapter->xfer() checking code into peci_register_adapter().
* Improved PECI command retry checking logic.
* Changed ioctl base from 'P' to 0xb6 to avoid confiliction and updated
  ioctl-number.txt to reflect the ioctl number of PECI subsystem.
* Added a comment to describe PECI retry action.
* Simplified return code handling of peci_ioctl_ping().
* Changed type of peci_ioctl_fn[] to static const.
* Fixed range checking code for valid PECI commands.
* Fixed the error return code on invalid PECI commands.
* Fixed incorrect definitions of PECI ioctl and its handling logic.

Changes from v1:
* Additionally implemented a core driver to support PECI linux bus driver
  model.
* Modified Aspeed PECI driver to make that to be an adapter driver in PECI
  bus.
* Modified PECI hwmon driver to make that to be a client driver in PECI
  bus.
* Simplified hwmon driver attribute labels and removed redundant strings.
* Removed core_nums from device tree setting of hwmon driver and modified
  core number detection logic to check the resolved_core register in client
  CPU's local PCI configuration area.
* Removed dimm_nums from device tree setting of hwmon driver and added
  populated DIMM detection logic to support dynamic creation.
* Removed indexing gap on core temperature and DIMM temperature attributes.
* Improved hwmon registration and dynamic attribute creation logic.
* Fixed structure definitions in PECI uapi header to make that use __u8,
  __u16 and etc.
* Modified wait_for_completion_interruptible_timeout error handling logic
  in Aspeed PECI driver to deliver errors correctly.
* Removed low-level xfer command from ioctl and kept only high-level PECI
  command suite as ioctls.
* Fixed I/O timeout logic in Aspeed PECI driver using ktime.
* Added a function into hwmon driver to simplify update delay checking.
* Added a function into hwmon driver to convert 10.6 to millidegree.
* Dropped non-standard attributes in hwmon driver.
* Fixed OF table for hwmon to make it indicate as a PECI client of Intel
  CPU target.
* Added a maintainer of PECI subsystem into MAINTAINERS document.

Fengguang Wu (1):
  drivers/peci: Add support for PECI bus driver core


Jae Hyun Yoo (10):
  Documentations: dt-bindings: Add documents of generic PECI bus,
    adapter and client drivers
  Documentations: ioctl: Add ioctl numbers for PECI subsystem
  drivers/peci: Add support for PECI bus driver core
  Documentations: dt-bindings: Add a document of PECI adapter driver for
    Aspeed AST24xx/25xx SoCs
  ARM: dts: aspeed: peci: Add PECI node
  drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  Documentation: dt-bindings: Add documents for PECI hwmon client
    drivers
  Documentation: hwmon: Add documents for PECI hwmon client drivers
  drivers/hwmon: Add PECI hwmon client drivers
  Add a maintainer for the PECI subsystem

 .../devicetree/bindings/hwmon/peci-cputemp.txt     |   24 +
 .../devicetree/bindings/hwmon/peci-dimmtemp.txt    |   25 +
 .../devicetree/bindings/peci/peci-adapter.txt      |   23 +
 .../devicetree/bindings/peci/peci-aspeed.txt       |   60 +
 .../devicetree/bindings/peci/peci-bus.txt          |   15 +
 .../devicetree/bindings/peci/peci-client.txt       |   25 +
 Documentation/hwmon/peci-cputemp                   |   88 ++
 Documentation/hwmon/peci-dimmtemp                  |   50 +
 Documentation/ioctl/ioctl-number.txt               |    2 +
 MAINTAINERS                                        |   10 +
 arch/arm/boot/dts/aspeed-g4.dtsi                   |   25 +
 arch/arm/boot/dts/aspeed-g5.dtsi                   |   25 +
 drivers/Kconfig                                    |    2 +
 drivers/Makefile                                   |    1 +
 drivers/hwmon/Kconfig                              |   28 +
 drivers/hwmon/Makefile                             |    2 +
 drivers/hwmon/peci-cputemp.c                       |  783 ++++++++++++
 drivers/hwmon/peci-dimmtemp.c                      |  432 +++++++
 drivers/peci/Kconfig                               |   45 +
 drivers/peci/Makefile                              |    9 +
 drivers/peci/peci-aspeed.c                         |  504 ++++++++
 drivers/peci/peci-core.c                           | 1291 ++++++++++++++++++++
 include/linux/peci.h                               |  107 ++
 include/uapi/linux/peci-ioctl.h                    |  200 +++
 24 files changed, 3776 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt
 create mode 100644 Documentation/hwmon/peci-cputemp
 create mode 100644 Documentation/hwmon/peci-dimmtemp
 create mode 100644 drivers/hwmon/peci-cputemp.c
 create mode 100644 drivers/hwmon/peci-dimmtemp.c
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/peci-aspeed.c
 create mode 100644 drivers/peci/peci-core.c
 create mode 100644 include/linux/peci.h
 create mode 100644 include/uapi/linux/peci-ioctl.h

-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH v3 02/10] Documentations: ioctl: Add ioctl numbers for PECI subsystem
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit Updates ioctl-number.txt to reflect ioctl numbers being
used by the PECI subsystem.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Haiyue Wang <haiyue.wang@linux.intel.com>
Cc: James Feist <james.feist@linux.intel.com>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: Vernon Mauery <vernon.mauery@linux.intel.com>
---
 Documentation/ioctl/ioctl-number.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 84bb74dcae12..4bc3a65d7204 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -323,6 +323,8 @@ Code  Seq#(hex)	Include File		Comments
 0xB3	00	linux/mmc/ioctl.h
 0xB4	00-0F	linux/gpio.h		<mailto:linux-gpio@vger.kernel.org>
 0xB5	00-0F	uapi/linux/rpmsg.h	<mailto:linux-remoteproc@vger.kernel.org>
+0xB6	00-0F	uapi/linux/peci-ioctl.h	PECI subsystem
+					<mailto:jae.hyun.yoo@linux.intel.com>
 0xC0	00-0F	linux/usb/iowarrior.h
 0xCA	00-0F	uapi/misc/cxl.h
 0xCA	10-2F	uapi/misc/ocxl.h
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds driver implementation for PECI bus core into linux
driver framework.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 drivers/Kconfig                 |    2 +
 drivers/Makefile                |    1 +
 drivers/peci/Kconfig            |   17 +
 drivers/peci/Makefile           |    6 +
 drivers/peci/peci-core.c        | 1291 +++++++++++++++++++++++++++++++++++++++
 include/linux/peci.h            |  107 ++++
 include/uapi/linux/peci-ioctl.h |  200 ++++++
 7 files changed, 1624 insertions(+)
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/peci-core.c
 create mode 100644 include/linux/peci.h
 create mode 100644 include/uapi/linux/peci-ioctl.h

diff --git a/drivers/Kconfig b/drivers/Kconfig
index 95b9ccc08165..8c44d9738377 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -217,4 +217,6 @@ source "drivers/siox/Kconfig"
 
 source "drivers/slimbus/Kconfig"
 
+source "drivers/peci/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 24cd47014657..250fe3d0fa7e 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -185,3 +185,4 @@ obj-$(CONFIG_TEE)		+= tee/
 obj-$(CONFIG_MULTIPLEXER)	+= mux/
 obj-$(CONFIG_UNISYS_VISORBUS)	+= visorbus/
 obj-$(CONFIG_SIOX)		+= siox/
+obj-$(CONFIG_PECI)		+= peci/
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
new file mode 100644
index 000000000000..1fbc13f9e6c2
--- /dev/null
+++ b/drivers/peci/Kconfig
@@ -0,0 +1,17 @@
+#
+# Platform Environment Control Interface (PECI) subsystem configuration
+#
+
+menu "PECI support"
+
+config PECI
+	bool "PECI support"
+	select RT_MUTEXES
+	select CRC8
+	help
+	  The Platform Environment Control Interface (PECI) is a one-wire bus
+	  interface that provides a communication channel between Intel
+	  processors and chipset components to external monitoring or control
+	  devices.
+
+endmenu
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
new file mode 100644
index 000000000000..9e8615e0d3ff
--- /dev/null
+++ b/drivers/peci/Makefile
@@ -0,0 +1,6 @@
+#
+# Makefile for the PECI core and bus drivers.
+#
+
+# Core functionality
+obj-$(CONFIG_PECI)		+= peci-core.o
diff --git a/drivers/peci/peci-core.c b/drivers/peci/peci-core.c
new file mode 100644
index 000000000000..9b45869b7c39
--- /dev/null
+++ b/drivers/peci/peci-core.c
@@ -0,0 +1,1291 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/crc8.h>
+#include <linux/delay.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/peci.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+/* Device Specific Completion Code (CC) Definition */
+#define DEV_PECI_CC_SUCCESS          0x40
+#define DEV_PECI_CC_TIMEOUT          0x80
+#define DEV_PECI_CC_OUT_OF_RESOURCE  0x81
+#define DEV_PECI_CC_UNAVAIL_RESOURCE 0x82
+#define DEV_PECI_CC_INVALID_REQ      0x90
+
+/* Completion Code mask to check retry needs */
+#define DEV_PECI_CC_RETRY_CHECK_MASK 0xf0
+#define DEV_PECI_CC_NEED_RETRY       0x80
+
+/* Skylake EDS says to retry for 250ms */
+#define DEV_PECI_RETRY_TIME_MS     250
+#define DEV_PECI_RETRY_INTERVAL_MS 10
+#define DEV_PECI_RETRY_BIT         0x01
+
+#define GET_TEMP_WR_LEN   1
+#define GET_TEMP_RD_LEN   2
+#define GET_TEMP_PECI_CMD 0x01
+
+#define GET_DIB_WR_LEN   1
+#define GET_DIB_RD_LEN   8
+#define GET_DIB_PECI_CMD 0xf7
+
+#define RDPKGCFG_WRITE_LEN     5
+#define RDPKGCFG_READ_LEN_BASE 1
+#define RDPKGCFG_PECI_CMD      0xa1
+
+#define WRPKGCFG_WRITE_LEN_BASE 6
+#define WRPKGCFG_READ_LEN       1
+#define WRPKGCFG_PECI_CMD       0xa5
+
+#define RDIAMSR_WRITE_LEN 5
+#define RDIAMSR_READ_LEN  9
+#define RDIAMSR_PECI_CMD  0xb1
+
+#define WRIAMSR_PECI_CMD  0xb5
+
+#define RDPCICFG_WRITE_LEN 6
+#define RDPCICFG_READ_LEN  5
+#define RDPCICFG_PECI_CMD  0x61
+
+#define WRPCICFG_PECI_CMD  0x65
+
+#define RDPCICFGLOCAL_WRITE_LEN     5
+#define RDPCICFGLOCAL_READ_LEN_BASE 1
+#define RDPCICFGLOCAL_PECI_CMD      0xe1
+
+#define WRPCICFGLOCAL_WRITE_LEN_BASE 6
+#define WRPCICFGLOCAL_READ_LEN       1
+#define WRPCICFGLOCAL_PECI_CMD       0xe5
+
+/* Macro for getting minor revision number from DIB */
+#define GET_MINOR_REV_NUM(x) (((x) >> 8) & 0xF)
+
+/* CRC8 table for Assure Write Frame Check */
+#define PECI_CRC8_POLYNOMIAL 0x07
+DECLARE_CRC8_TABLE(peci_crc8_table);
+
+static struct device_type peci_adapter_type;
+static struct device_type peci_client_type;
+
+/* Max number of peci cdev */
+#define PECI_CDEV_MAX    16
+
+/* Max index of devices sharing the same client address */
+#define PECI_DEV_IDX_MAX 16
+
+static dev_t peci_devt;
+static bool is_registered;
+
+static DEFINE_MUTEX(core_lock);
+static DEFINE_IDR(peci_adapter_idr);
+
+static ssize_t name_show(struct device *dev,
+			 struct device_attribute *attr,
+			 char *buf)
+{
+	return sprintf(buf, "%s\n", dev->type == &peci_client_type ?
+		       to_peci_client(dev)->name : to_peci_adapter(dev)->name);
+}
+static DEVICE_ATTR_RO(name);
+
+static void peci_client_dev_release(struct device *dev)
+{
+	kfree(to_peci_client(dev));
+}
+
+static struct attribute *peci_device_attrs[] = {
+	&dev_attr_name.attr,
+	NULL
+};
+ATTRIBUTE_GROUPS(peci_device);
+
+static struct device_type peci_client_type = {
+	.groups		= peci_device_groups,
+	.release	= peci_client_dev_release,
+};
+
+static struct peci_client *peci_verify_client(struct device *dev)
+{
+	return (dev->type == &peci_client_type)
+			? to_peci_client(dev)
+			: NULL;
+}
+
+static void peci_adapter_dev_release(struct device *dev)
+{
+	/* do nothing */
+}
+
+static struct attribute *peci_adapter_attrs[] = {
+	&dev_attr_name.attr,
+	NULL
+};
+ATTRIBUTE_GROUPS(peci_adapter);
+
+static struct device_type peci_adapter_type = {
+	.groups		= peci_adapter_groups,
+	.release	= peci_adapter_dev_release,
+};
+
+static struct peci_adapter *peci_verify_adapter(struct device *dev)
+{
+	return (dev->type == &peci_adapter_type)
+			? to_peci_adapter(dev)
+			: NULL;
+}
+
+static struct peci_adapter *peci_get_adapter(int nr)
+{
+	struct peci_adapter *adapter;
+
+	mutex_lock(&core_lock);
+	adapter = idr_find(&peci_adapter_idr, nr);
+	if (!adapter)
+		goto out_unlock;
+
+	if (try_module_get(adapter->owner))
+		get_device(&adapter->dev);
+	else
+		adapter = NULL;
+
+out_unlock:
+	mutex_unlock(&core_lock);
+	return adapter;
+}
+
+static void peci_put_adapter(struct peci_adapter *adapter)
+{
+	if (!adapter)
+		return;
+
+	put_device(&adapter->dev);
+	module_put(adapter->owner);
+}
+
+static u8 peci_aw_fcs(u8 *data, int len)
+{
+	return crc8(peci_crc8_table, data, (size_t)len, 0);
+}
+
+static int __peci_xfer(struct peci_adapter *adapter, struct peci_xfer_msg *msg,
+		       bool do_retry, bool has_aw_fcs)
+{
+	ktime_t start, end;
+	s64 elapsed_ms;
+	int rc = 0;
+
+	/**
+	 * For some commands, the PECI originator may need to retry a command if
+	 * the processor PECI client responds with a 0x8x completion code. In
+	 * each instance, the processor PECI client may have started the
+	 * operation but not completed it yet. When the 'retry' bit is set, the
+	 * PECI client will ignore a new request if it exactly matches a
+	 * previous valid request.
+	 */
+
+	if (do_retry)
+		start = ktime_get();
+
+	do {
+		rc = adapter->xfer(adapter, msg);
+
+		if (!do_retry || rc)
+			break;
+
+		if (msg->rx_buf[0] == DEV_PECI_CC_SUCCESS)
+			break;
+
+		/* Retry is needed when completion code is 0x8x */
+		if ((msg->rx_buf[0] & DEV_PECI_CC_RETRY_CHECK_MASK) !=
+		    DEV_PECI_CC_NEED_RETRY) {
+			rc = -EIO;
+			break;
+		}
+
+		/* Set the retry bit to indicate a retry attempt */
+		msg->tx_buf[1] |= DEV_PECI_RETRY_BIT;
+
+		/* Recalculate the AW FCS if it has one */
+		if (has_aw_fcs)
+			msg->tx_buf[msg->tx_len - 1] = 0x80 ^
+						peci_aw_fcs((u8 *)msg,
+							    2 + msg->tx_len);
+
+		/**
+		 * Retry for at least 250ms before returning an error.
+		 * Retry interval guideline:
+		 *   No minimum < Retry Interval < No maximum
+		 *                (recommend 10ms)
+		 */
+		end = ktime_get();
+		elapsed_ms = ktime_to_ms(ktime_sub(end, start));
+		if (elapsed_ms >= DEV_PECI_RETRY_TIME_MS) {
+			dev_dbg(&adapter->dev, "Timeout retrying xfer!\n");
+			rc = -ETIMEDOUT;
+			break;
+		}
+
+		usleep_range(DEV_PECI_RETRY_INTERVAL_MS * 1000,
+			     (DEV_PECI_RETRY_INTERVAL_MS * 1000) + 1000);
+	} while (true);
+
+	if (rc)
+		dev_dbg(&adapter->dev, "xfer error, rc: %d\n", rc);
+
+	return rc;
+}
+
+static int peci_xfer(struct peci_adapter *adapter, struct peci_xfer_msg *msg)
+{
+	return __peci_xfer(adapter, msg, false, false);
+}
+
+static int peci_xfer_with_retries(struct peci_adapter *adapter,
+				  struct peci_xfer_msg *msg,
+				  bool has_aw_fcs)
+{
+	return __peci_xfer(adapter, msg, true, has_aw_fcs);
+}
+
+static int peci_scan_cmd_mask(struct peci_adapter *adapter)
+{
+	struct peci_xfer_msg msg;
+	int rc = 0;
+	u32 dib;
+
+	/* Update command mask just once */
+	if (adapter->cmd_mask & BIT(PECI_CMD_PING))
+		return 0;
+
+	msg.addr      = PECI_BASE_ADDR;
+	msg.tx_len    = GET_DIB_WR_LEN;
+	msg.rx_len    = GET_DIB_RD_LEN;
+	msg.tx_buf[0] = GET_DIB_PECI_CMD;
+
+	rc = peci_xfer(adapter, &msg);
+	if (rc)
+		return rc;
+
+	dib = msg.rx_buf[0] | (msg.rx_buf[1] << 8) |
+	      (msg.rx_buf[2] << 16) | (msg.rx_buf[3] << 24);
+
+	/* Check special case for Get DIB command */
+	if (dib == 0x00) {
+		dev_dbg(&adapter->dev, "DIB read as 0x00\n");
+		return -EIO;
+	}
+
+	/**
+	 * Setting up the supporting commands based on minor revision number.
+	 * See PECI Spec Table 3-1.
+	 */
+	switch (GET_MINOR_REV_NUM(dib)) {
+	case 6:
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_IA_MSR);
+		/* fallthrough */
+	case 5:
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG);
+		/* fallthrough */
+	case 4:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG);
+		/* fallthrough */
+	case 3:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG_LOCAL);
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG_LOCAL);
+		/* fallthrough */
+	case 2:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_IA_MSR);
+		/* fallthrough */
+	case 1:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_PKG_CFG);
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_PKG_CFG);
+	}
+
+	adapter->cmd_mask |= BIT(PECI_CMD_GET_TEMP);
+	adapter->cmd_mask |= BIT(PECI_CMD_GET_DIB);
+	adapter->cmd_mask |= BIT(PECI_CMD_PING);
+
+	return rc;
+}
+
+static int peci_cmd_support(struct peci_adapter *adapter, enum peci_cmd cmd)
+{
+	if (!(adapter->cmd_mask & BIT(PECI_CMD_PING)) &&
+	    peci_scan_cmd_mask(adapter) < 0) {
+		dev_dbg(&adapter->dev, "Failed to scan command mask\n");
+		return -EIO;
+	}
+
+	if (!(adapter->cmd_mask & BIT(cmd))) {
+		dev_dbg(&adapter->dev, "Command %d is not supported\n", cmd);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int peci_ioctl_ping(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_ping_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+
+	msg.addr   = umsg->addr;
+	msg.tx_len = 0;
+	msg.rx_len = 0;
+
+	return peci_xfer(adapter, &msg);
+}
+
+static int peci_ioctl_get_dib(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_get_dib_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc;
+
+	msg.addr      = umsg->addr;
+	msg.tx_len    = GET_DIB_WR_LEN;
+	msg.rx_len    = GET_DIB_RD_LEN;
+	msg.tx_buf[0] = GET_DIB_PECI_CMD;
+
+	rc = peci_xfer(adapter, &msg);
+	if (rc)
+		return rc;
+
+	umsg->dib = msg.rx_buf[0] | (msg.rx_buf[1] << 8) |
+		     (msg.rx_buf[2] << 16) | (msg.rx_buf[3] << 24);
+
+	return 0;
+}
+
+static int peci_ioctl_get_temp(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_get_temp_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc;
+
+	msg.addr      = umsg->addr;
+	msg.tx_len    = GET_TEMP_WR_LEN;
+	msg.rx_len    = GET_TEMP_RD_LEN;
+	msg.tx_buf[0] = GET_TEMP_PECI_CMD;
+
+	rc = peci_xfer(adapter, &msg);
+	if (rc)
+		return rc;
+
+	umsg->temp_raw = msg.rx_buf[0] | (msg.rx_buf[1] << 8);
+
+	return 0;
+}
+
+static int peci_ioctl_rd_pkg_cfg(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_pkg_cfg_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0;
+
+	/* Per the PECI spec, the read length must be a byte, word, or dword */
+	if (umsg->rx_len != 1 && umsg->rx_len != 2 && umsg->rx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid read length, rx_len: %d\n",
+			umsg->rx_len);
+		return -EINVAL;
+	}
+
+	msg.addr = umsg->addr;
+	msg.tx_len = RDPKGCFG_WRITE_LEN;
+	/* read lengths of 1 and 2 result in an error, so only use 4 for now */
+	msg.rx_len = RDPKGCFG_READ_LEN_BASE + umsg->rx_len;
+	msg.tx_buf[0] = RDPKGCFG_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = umsg->index;            /* RdPkgConfig index */
+	msg.tx_buf[3] = (u8)umsg->param;        /* LSB - Config parameter */
+	msg.tx_buf[4] = (u8)(umsg->param >> 8); /* MSB - Config parameter */
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(umsg->pkg_config, &msg.rx_buf[1], umsg->rx_len);
+
+	return rc;
+}
+
+static int peci_ioctl_wr_pkg_cfg(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_wr_pkg_cfg_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0, i;
+
+	/* Per the PECI spec, the write length must be a dword */
+	if (umsg->tx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid write length, tx_len: %d\n",
+			umsg->tx_len);
+		return -EINVAL;
+	}
+
+	msg.addr = umsg->addr;
+	msg.tx_len = WRPKGCFG_WRITE_LEN_BASE + umsg->tx_len;
+	/* read lengths of 1 and 2 result in an error, so only use 4 for now */
+	msg.rx_len = WRPKGCFG_READ_LEN;
+	msg.tx_buf[0] = WRPKGCFG_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = umsg->index;            /* RdPkgConfig index */
+	msg.tx_buf[3] = (u8)umsg->param;        /* LSB - Config parameter */
+	msg.tx_buf[4] = (u8)(umsg->param >> 8); /* MSB - Config parameter */
+	for (i = 0; i < umsg->tx_len; i++)
+		msg.tx_buf[5 + i] = (u8)(umsg->value >> (i << 3));
+
+	/* Add an Assure Write Frame Check Sequence byte */
+	msg.tx_buf[5 + i] = 0x80 ^
+			    peci_aw_fcs((u8 *)&msg, 8 + umsg->tx_len);
+
+	rc = peci_xfer_with_retries(adapter, &msg, true);
+
+	return rc;
+}
+
+static int peci_ioctl_rd_ia_msr(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_ia_msr_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0;
+
+	msg.addr = umsg->addr;
+	msg.tx_len = RDIAMSR_WRITE_LEN;
+	msg.rx_len = RDIAMSR_READ_LEN;
+	msg.tx_buf[0] = RDIAMSR_PECI_CMD;
+	msg.tx_buf[1] = 0x00;
+	msg.tx_buf[2] = umsg->thread_id;
+	msg.tx_buf[3] = (u8)umsg->address;
+	msg.tx_buf[4] = (u8)(umsg->address >> 8);
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(&umsg->value, &msg.rx_buf[1], sizeof(uint64_t));
+
+	return rc;
+}
+
+static int peci_ioctl_rd_pci_cfg(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_pci_cfg_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	u32 address;
+	int rc = 0;
+
+	address = umsg->reg;                  /* [11:0]  - Register */
+	address |= (u32)umsg->function << 12; /* [14:12] - Function */
+	address |= (u32)umsg->device << 15;   /* [19:15] - Device   */
+	address |= (u32)umsg->bus << 20;      /* [27:20] - Bus      */
+					      /* [31:28] - Reserved */
+	msg.addr = umsg->addr;
+	msg.tx_len = RDPCICFG_WRITE_LEN;
+	msg.rx_len = RDPCICFG_READ_LEN;
+	msg.tx_buf[0] = RDPCICFG_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = (u8)address;         /* LSB - PCI Config Address */
+	msg.tx_buf[3] = (u8)(address >> 8);  /* PCI Config Address */
+	msg.tx_buf[4] = (u8)(address >> 16); /* PCI Config Address */
+	msg.tx_buf[5] = (u8)(address >> 24); /* MSB - PCI Config Address */
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(umsg->pci_config, &msg.rx_buf[1], 4);
+
+	return rc;
+}
+
+static int peci_ioctl_rd_pci_cfg_local(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_pci_cfg_local_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	u32 address;
+	int rc = 0;
+
+	/* Per the PECI spec, the read length must be a byte, word, or dword */
+	if (umsg->rx_len != 1 && umsg->rx_len != 2 && umsg->rx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid read length, rx_len: %d\n",
+			umsg->rx_len);
+		return -EINVAL;
+	}
+
+	address = umsg->reg;                  /* [11:0]  - Register */
+	address |= (u32)umsg->function << 12; /* [14:12] - Function */
+	address |= (u32)umsg->device << 15;   /* [19:15] - Device   */
+	address |= (u32)umsg->bus << 20;      /* [23:20] - Bus      */
+
+	msg.addr = umsg->addr;
+	msg.tx_len = RDPCICFGLOCAL_WRITE_LEN;
+	msg.rx_len = RDPCICFGLOCAL_READ_LEN_BASE + umsg->rx_len;
+	msg.tx_buf[0] = RDPCICFGLOCAL_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = (u8)address;       /* LSB - PCI Configuration Address */
+	msg.tx_buf[3] = (u8)(address >> 8);  /* PCI Configuration Address */
+	msg.tx_buf[4] = (u8)(address >> 16); /* PCI Configuration Address */
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(umsg->pci_config, &msg.rx_buf[1], umsg->rx_len);
+
+	return rc;
+}
+
+static int peci_ioctl_wr_pci_cfg_local(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_wr_pci_cfg_local_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0, i;
+	u32 address;
+
+	/* Per the PECI spec, the write length must be a byte, word, or dword */
+	if (umsg->tx_len != 1 && umsg->tx_len != 2 && umsg->tx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid write length, tx_len: %d\n",
+			umsg->tx_len);
+		return -EINVAL;
+	}
+
+	address = umsg->reg;                  /* [11:0]  - Register */
+	address |= (u32)umsg->function << 12; /* [14:12] - Function */
+	address |= (u32)umsg->device << 15;   /* [19:15] - Device   */
+	address |= (u32)umsg->bus << 20;      /* [23:20] - Bus      */
+
+	msg.addr = umsg->addr;
+	msg.tx_len = WRPCICFGLOCAL_WRITE_LEN_BASE + umsg->tx_len;
+	msg.rx_len = WRPCICFGLOCAL_READ_LEN;
+	msg.tx_buf[0] = WRPCICFGLOCAL_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = (u8)address;       /* LSB - PCI Configuration Address */
+	msg.tx_buf[3] = (u8)(address >> 8);  /* PCI Configuration Address */
+	msg.tx_buf[4] = (u8)(address >> 16); /* PCI Configuration Address */
+	for (i = 0; i < umsg->tx_len; i++)
+		msg.tx_buf[5 + i] = (u8)(umsg->value >> (i << 3));
+
+	/* Add an Assure Write Frame Check Sequence byte */
+	msg.tx_buf[5 + i] = 0x80 ^
+			    peci_aw_fcs((u8 *)&msg, 8 + umsg->tx_len);
+
+	rc = peci_xfer_with_retries(adapter, &msg, true);
+
+	return rc;
+}
+
+typedef int (*peci_ioctl_fn_type)(struct peci_adapter *, void *);
+
+static const peci_ioctl_fn_type peci_ioctl_fn[PECI_CMD_MAX] = {
+	NULL, /* Reserved */
+	peci_ioctl_ping,
+	peci_ioctl_get_dib,
+	peci_ioctl_get_temp,
+	peci_ioctl_rd_pkg_cfg,
+	peci_ioctl_wr_pkg_cfg,
+	peci_ioctl_rd_ia_msr,
+	NULL, /* Reserved */
+	peci_ioctl_rd_pci_cfg,
+	NULL, /* Reserved */
+	peci_ioctl_rd_pci_cfg_local,
+	peci_ioctl_wr_pci_cfg_local,
+};
+
+int peci_command(struct peci_adapter *adapter, enum peci_cmd cmd, void *vmsg)
+{
+	int rc = 0;
+
+	if (cmd >= PECI_CMD_MAX || cmd < PECI_CMD_XFER)
+		return -EINVAL;
+
+	dev_dbg(&adapter->dev, "%s, cmd=0x%02x\n", __func__, cmd);
+
+	if (!peci_ioctl_fn[cmd])
+		return -EINVAL;
+
+	rt_mutex_lock(&adapter->bus_lock);
+
+	rc = peci_cmd_support(adapter, cmd);
+	if (!rc)
+		rc = peci_ioctl_fn[cmd](adapter, vmsg);
+
+	rt_mutex_unlock(&adapter->bus_lock);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(peci_command);
+
+static long peci_ioctl(struct file *file, unsigned int iocmd, unsigned long arg)
+{
+	struct peci_adapter *adapter = file->private_data;
+	void __user *argp = (void __user *)arg;
+	unsigned int msg_len;
+	enum peci_cmd cmd;
+	int rc = 0;
+	u8 *msg;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	dev_dbg(&adapter->dev, "ioctl, cmd=0x%x, arg=0x%lx\n", iocmd, arg);
+
+	switch (iocmd) {
+	case PECI_IOC_PING:
+	case PECI_IOC_GET_DIB:
+	case PECI_IOC_GET_TEMP:
+	case PECI_IOC_RD_PKG_CFG:
+	case PECI_IOC_WR_PKG_CFG:
+	case PECI_IOC_RD_IA_MSR:
+	case PECI_IOC_RD_PCI_CFG:
+	case PECI_IOC_RD_PCI_CFG_LOCAL:
+	case PECI_IOC_WR_PCI_CFG_LOCAL:
+		cmd = _IOC_NR(iocmd);
+		msg_len = _IOC_SIZE(iocmd);
+		break;
+
+	default:
+		dev_dbg(&adapter->dev, "Invalid ioctl cmd : 0x%x\n", iocmd);
+		return -ENOTTY;
+	}
+
+	msg = memdup_user(argp, msg_len);
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	rc = peci_command(adapter, cmd, msg);
+
+	if (!rc && copy_to_user(argp, msg, msg_len))
+		rc = -EFAULT;
+
+	kfree(msg);
+	return (long)rc;
+}
+
+static int peci_open(struct inode *inode, struct file *file)
+{
+	unsigned int minor = iminor(inode);
+	struct peci_adapter *adapter;
+
+	adapter = peci_get_adapter(minor);
+	if (!adapter)
+		return -ENODEV;
+
+	file->private_data = adapter;
+
+	return 0;
+}
+
+static int peci_release(struct inode *inode, struct file *file)
+{
+	struct peci_adapter *adapter = file->private_data;
+
+	peci_put_adapter(adapter);
+	file->private_data = NULL;
+
+	return 0;
+}
+
+static const struct file_operations peci_fops = {
+	.owner          = THIS_MODULE,
+	.unlocked_ioctl = peci_ioctl,
+	.open           = peci_open,
+	.release        = peci_release,
+};
+
+static int peci_detect(struct peci_adapter *adapter, u8 addr)
+{
+	struct peci_ping_msg msg;
+
+	msg.addr = addr;
+
+	return peci_command(adapter, PECI_CMD_PING, &msg);
+}
+
+#if IS_ENABLED(CONFIG_OF)
+static const struct of_device_id *
+peci_of_match_device(const struct of_device_id *matches,
+		     struct peci_client *client)
+{
+	if (!(client && matches))
+		return NULL;
+
+	return of_match_device(matches, &client->dev);
+}
+#endif
+
+static const struct peci_device_id *
+peci_match_id(const struct peci_device_id *id, struct peci_client *client)
+{
+	if (!(id && client))
+		return NULL;
+
+	while (id->name[0]) {
+		if (strcmp(client->name, id->name) == 0)
+			return id;
+		id++;
+	}
+
+	return NULL;
+}
+
+static int peci_device_match(struct device *dev, struct device_driver *drv)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_driver *driver;
+
+	/* Attempt an OF style match */
+	if (peci_of_match_device(drv->of_match_table, client))
+		return 1;
+
+	driver = to_peci_driver(drv);
+
+	if (peci_match_id(driver->id_table, client))
+		return 1;
+
+	return 0;
+}
+
+static int peci_device_probe(struct device *dev)
+{
+	struct peci_client	*client = peci_verify_client(dev);
+	struct peci_driver	*driver;
+	int status = -EINVAL;
+
+	if (!client)
+		return 0;
+
+	if (!peci_of_match_device(dev->driver->of_match_table, client))
+		return -ENODEV;
+
+	dev_dbg(dev, "%s: name:%s\n", __func__, client->name);
+
+	driver = to_peci_driver(dev->driver);
+	if (driver->probe)
+		status = driver->probe(client);
+
+	return status;
+}
+
+static int peci_device_remove(struct device *dev)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_driver *driver;
+	int status = 0;
+
+	if (!client || !dev->driver)
+		return 0;
+
+	driver = to_peci_driver(dev->driver);
+	if (driver->remove) {
+		dev_dbg(dev, "%s: name:%s\n", __func__, client->name);
+		status = driver->remove(client);
+	}
+
+	return status;
+}
+
+static void peci_device_shutdown(struct device *dev)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_driver *driver;
+
+	if (!client || !dev->driver)
+		return;
+
+	dev_dbg(dev, "%s: name:%s\n", __func__, client->name);
+
+	driver = to_peci_driver(dev->driver);
+	if (driver->shutdown)
+		driver->shutdown(client);
+}
+
+static struct bus_type peci_bus_type = {
+	.name		= "peci",
+	.match		= peci_device_match,
+	.probe		= peci_device_probe,
+	.remove		= peci_device_remove,
+	.shutdown	= peci_device_shutdown,
+};
+
+static void peci_unregister_device(struct peci_client *client)
+{
+	if (client->dev.of_node)
+		of_node_clear_flag(client->dev.of_node, OF_POPULATED);
+
+	device_unregister(&client->dev);
+}
+
+static int peci_check_addr_validity(u8 addr)
+{
+	if (addr < PECI_BASE_ADDR && addr > PECI_BASE_ADDR + PECI_OFFSET_MAX)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int peci_check_client_busy(struct device *dev, void *client_new_p)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_client *client_new = client_new_p;
+
+	if (client && client->addr == client_new->addr &&
+	    client->idx == client_new->idx)
+		return -EBUSY;
+
+	return 0;
+}
+
+static struct peci_client *peci_new_device(struct peci_adapter *adapter,
+					   struct peci_board_info const *info)
+{
+	struct peci_client *client;
+	int rc;
+
+	client = kzalloc(sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return NULL;
+
+	client->adapter = adapter;
+	client->addr = info->addr;
+	strlcpy(client->name, info->type, sizeof(client->name));
+
+	rc = peci_check_addr_validity(client->addr);
+	if (rc) {
+		dev_err(&adapter->dev, "Invalid PECI CPU address 0x%02hx\n",
+			client->addr);
+		goto err_free_client_silent;
+	}
+
+	/* Check client's online status */
+	rc = peci_detect(adapter, client->addr);
+	if (rc)
+		goto err_free_client;
+
+	for (client->idx = 0; client->idx < PECI_DEV_IDX_MAX; client->idx++) {
+		rc = device_for_each_child(&adapter->dev, client,
+					   peci_check_client_busy);
+		if (!rc)
+			break;
+	}
+
+	if (rc || client->idx == PECI_DEV_IDX_MAX)
+		goto err_free_client;
+
+	client->dev.parent = &client->adapter->dev;
+	client->dev.bus = &peci_bus_type;
+	client->dev.type = &peci_client_type;
+	client->dev.of_node = info->of_node;
+	dev_set_name(&client->dev, "%d-%02x:%02x",
+		     adapter->nr, client->addr, client->idx);
+
+	rc = device_register(&client->dev);
+	if (rc)
+		goto err_free_client;
+
+	dev_dbg(&adapter->dev, "client [%s] registered with bus id %s\n",
+		client->name, dev_name(&client->dev));
+
+	return client;
+
+err_free_client:
+	dev_err(&adapter->dev,
+		"Failed to register peci client %s at 0x%02x (%d)\n",
+		client->name, client->addr, rc);
+err_free_client_silent:
+	kfree(client);
+	return NULL;
+}
+
+#if IS_ENABLED(CONFIG_OF)
+static struct peci_client *peci_of_register_device(struct peci_adapter *adapter,
+						   struct device_node *node)
+{
+	struct peci_board_info info = {};
+	struct peci_client *result;
+	const __be32 *addr_be;
+	u32 addr;
+	int len;
+
+	dev_dbg(&adapter->dev, "register %s\n", node->full_name);
+
+	if (of_modalias_node(node, info.type, sizeof(info.type)) < 0) {
+		dev_err(&adapter->dev, "modalias failure on %s\n",
+			node->full_name);
+		return ERR_PTR(-EINVAL);
+	}
+
+	addr_be = of_get_property(node, "reg", &len);
+	if (!addr_be || len < sizeof(*addr_be)) {
+		dev_err(&adapter->dev, "invalid reg on %s\n",
+			node->full_name);
+		return ERR_PTR(-EINVAL);
+	}
+
+	addr = be32_to_cpup(addr_be);
+
+	if (peci_check_addr_validity(addr)) {
+		dev_err(&adapter->dev, "invalid addr=%x on %s\n",
+			addr, node->full_name);
+		return ERR_PTR(-EINVAL);
+	}
+
+	info.addr = addr;
+	info.of_node = of_node_get(node);
+
+	result = peci_new_device(adapter, &info);
+	if (!result)
+		result = ERR_PTR(-EINVAL);
+
+	of_node_put(node);
+	return result;
+}
+
+static void peci_of_register_devices(struct peci_adapter *adapter)
+{
+	struct device_node *bus, *node;
+	struct peci_client *client;
+
+	/* Only register child devices if the adapter has a node pointer set */
+	if (!adapter->dev.of_node)
+		return;
+
+	bus = of_get_child_by_name(adapter->dev.of_node, "peci-bus");
+	if (!bus)
+		bus = of_node_get(adapter->dev.of_node);
+
+	for_each_available_child_of_node(bus, node) {
+		if (of_node_test_and_set_flag(node, OF_POPULATED))
+			continue;
+
+		client = peci_of_register_device(adapter, node);
+		if (IS_ERR(client)) {
+			dev_warn(&adapter->dev,
+				 "Failed to create PECI device for %s\n",
+				 node->full_name);
+			of_node_clear_flag(node, OF_POPULATED);
+		}
+	}
+
+	of_node_put(bus);
+}
+
+static int peci_of_match_node(struct device *dev, void *data)
+{
+	return dev->of_node == data;
+}
+
+/* must call put_device() when done with returned peci_client device */
+static struct peci_client *peci_of_find_device(struct device_node *node)
+{
+	struct peci_client *client;
+	struct device *dev;
+
+	dev = bus_find_device(&peci_bus_type, NULL, node, peci_of_match_node);
+	if (!dev)
+		return NULL;
+
+	client = peci_verify_client(dev);
+	if (!client)
+		put_device(dev);
+
+	return client;
+}
+
+/* must call put_device() when done with returned peci_adapter device */
+static struct peci_adapter *peci_of_find_adapter(struct device_node *node)
+{
+	struct peci_adapter *adapter;
+	struct device *dev;
+
+	dev = bus_find_device(&peci_bus_type, NULL, node, peci_of_match_node);
+	if (!dev)
+		return NULL;
+
+	adapter = peci_verify_adapter(dev);
+	if (!adapter)
+		put_device(dev);
+
+	return adapter;
+}
+#else
+static void peci_of_register_devices(struct peci_adapter *adapter) { }
+#endif /* CONFIG_OF */
+
+#if IS_ENABLED(CONFIG_OF_DYNAMIC)
+static int peci_of_notify(struct notifier_block *nb,
+			  unsigned long action,
+			  void *arg)
+{
+	struct of_reconfig_data *rd = arg;
+	struct peci_adapter *adapter;
+	struct peci_client *client;
+
+	switch (of_reconfig_get_state_change(action, rd)) {
+	case OF_RECONFIG_CHANGE_ADD:
+		adapter = peci_of_find_adapter(rd->dn->parent);
+		if (!adapter)
+			return NOTIFY_OK;	/* not for us */
+
+		if (of_node_test_and_set_flag(rd->dn, OF_POPULATED)) {
+			put_device(&adapter->dev);
+			return NOTIFY_OK;
+		}
+
+		client = peci_of_register_device(adapter, rd->dn);
+		put_device(&adapter->dev);
+
+		if (IS_ERR(client)) {
+			dev_err(&adapter->dev,
+				"failed to create client for '%s'\n",
+				rd->dn->full_name);
+			of_node_clear_flag(rd->dn, OF_POPULATED);
+			return notifier_from_errno(PTR_ERR(client));
+		}
+		break;
+	case OF_RECONFIG_CHANGE_REMOVE:
+		/* already depopulated? */
+		if (!of_node_check_flag(rd->dn, OF_POPULATED))
+			return NOTIFY_OK;
+
+		/* find our device by node */
+		client = peci_of_find_device(rd->dn);
+		if (!client)
+			return NOTIFY_OK;	/* no? not meant for us */
+
+		/* unregister takes one ref away */
+		peci_unregister_device(client);
+
+		/* and put the reference of the find */
+		put_device(&client->dev);
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block peci_of_notifier = {
+	.notifier_call = peci_of_notify,
+};
+#else
+extern struct notifier_block peci_of_notifier;
+#endif /* CONFIG_OF_DYNAMIC */
+
+static int peci_register_adapter(struct peci_adapter *adapter)
+{
+	int rc = -EINVAL;
+
+	/* Can't register until after driver model init */
+	if (WARN_ON(!is_registered))
+		goto err_free_idr;
+
+	if (WARN(!adapter->name[0], "peci adapter has no name"))
+		goto err_free_idr;
+
+	if (WARN(!adapter->xfer, "peci adapter has no xfer function\n"))
+		goto err_free_idr;
+
+	rt_mutex_init(&adapter->bus_lock);
+
+	dev_set_name(&adapter->dev, "peci%d", adapter->nr);
+	adapter->dev.bus = &peci_bus_type;
+	adapter->dev.type = &peci_adapter_type;
+	device_initialize(&adapter->dev);
+
+	/* cdev */
+	cdev_init(&adapter->cdev, &peci_fops);
+	adapter->cdev.owner = THIS_MODULE;
+	adapter->cdev.kobj.parent = &adapter->dev.kobj;
+	adapter->dev.devt = MKDEV(MAJOR(peci_devt), adapter->nr);
+	rc = cdev_add(&adapter->cdev, adapter->dev.devt, 1);
+	if (rc) {
+		pr_err("adapter '%s': can't add cdev (%d)\n",
+		       adapter->name, rc);
+		goto err_free_idr;
+	}
+	rc = device_add(&adapter->dev);
+	if (rc) {
+		pr_err("adapter '%s': can't add device (%d)\n",
+		       adapter->name, rc);
+		goto err_del_cdev;
+	}
+
+	dev_dbg(&adapter->dev, "adapter [%s] registered\n", adapter->name);
+
+	/* create pre-declared device nodes */
+	peci_of_register_devices(adapter);
+
+	return 0;
+
+err_del_cdev:
+	cdev_del(&adapter->cdev);
+err_free_idr:
+	mutex_lock(&core_lock);
+	idr_remove(&peci_adapter_idr, adapter->nr);
+	mutex_unlock(&core_lock);
+	return rc;
+}
+
+static int peci_add_numbered_adapter(struct peci_adapter *adapter)
+{
+	int id;
+
+	mutex_lock(&core_lock);
+	id = idr_alloc(&peci_adapter_idr, adapter,
+		       adapter->nr, adapter->nr + 1, GFP_KERNEL);
+	mutex_unlock(&core_lock);
+	if (WARN(id < 0, "couldn't get idr"))
+		return id == -ENOSPC ? -EBUSY : id;
+
+	return peci_register_adapter(adapter);
+}
+
+int peci_add_adapter(struct peci_adapter *adapter)
+{
+	struct device *dev = &adapter->dev;
+	int id;
+
+	if (dev->of_node) {
+		id = of_alias_get_id(dev->of_node, "peci");
+		if (id >= 0) {
+			adapter->nr = id;
+			return peci_add_numbered_adapter(adapter);
+		}
+	}
+
+	mutex_lock(&core_lock);
+	id = idr_alloc(&peci_adapter_idr, adapter, 0, 0, GFP_KERNEL);
+	mutex_unlock(&core_lock);
+	if (WARN(id < 0, "couldn't get idr"))
+		return id;
+
+	adapter->nr = id;
+
+	return peci_register_adapter(adapter);
+}
+EXPORT_SYMBOL_GPL(peci_add_adapter);
+
+static int peci_unregister_client(struct device *dev, void *dummy)
+{
+	struct peci_client *client = peci_verify_client(dev);
+
+	if (client)
+		peci_unregister_device(client);
+
+	return 0;
+}
+
+void peci_del_adapter(struct peci_adapter *adapter)
+{
+	struct peci_adapter *found;
+
+	/* First make sure that this adapter was ever added */
+	mutex_lock(&core_lock);
+	found = idr_find(&peci_adapter_idr, adapter->nr);
+	mutex_unlock(&core_lock);
+
+	if (found != adapter)
+		return;
+
+	/**
+	 * Detach any active clients. This can't fail, thus we do not
+	 * check the returned value.
+	 */
+	device_for_each_child(&adapter->dev, NULL, peci_unregister_client);
+
+	/* device name is gone after device_unregister */
+	dev_dbg(&adapter->dev, "adapter [%s] unregistered\n", adapter->name);
+
+	device_unregister(&adapter->dev);
+
+	/* free cdev */
+	cdev_del(&adapter->cdev);
+
+	/* free bus id */
+	mutex_lock(&core_lock);
+	idr_remove(&peci_adapter_idr, adapter->nr);
+	mutex_unlock(&core_lock);
+}
+EXPORT_SYMBOL_GPL(peci_del_adapter);
+
+/**
+ * A peci_driver is used with one or more peci_client (device) nodes to access
+ * peci clients, on a bus instance associated with some peci_adapter.
+ */
+int peci_register_driver(struct module *owner, struct peci_driver *driver)
+{
+	int rc;
+
+	/* Can't register until after driver model init */
+	if (WARN_ON(!is_registered))
+		return -EAGAIN;
+
+	/* add the driver to the list of peci drivers in the driver core */
+	driver->driver.owner = owner;
+	driver->driver.bus = &peci_bus_type;
+
+	/**
+	 * When registration returns, the driver core
+	 * will have called probe() for all matching-but-unbound devices.
+	 */
+	rc = driver_register(&driver->driver);
+	if (rc)
+		return rc;
+
+	pr_debug("driver [%s] registered\n", driver->driver.name);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(peci_register_driver);
+
+void peci_del_driver(struct peci_driver *driver)
+{
+	driver_unregister(&driver->driver);
+	pr_debug("driver [%s] unregistered\n", driver->driver.name);
+}
+EXPORT_SYMBOL_GPL(peci_del_driver);
+
+static int __init peci_init(void)
+{
+	int ret;
+
+	ret = bus_register(&peci_bus_type);
+	if (ret < 0) {
+		pr_err("peci: Failed to register PECI bus type!\n");
+		return ret;
+	}
+
+	ret = alloc_chrdev_region(&peci_devt, 0, PECI_CDEV_MAX, "peci");
+	if (ret < 0) {
+		pr_err("peci: Failed to allocate chr dev region!\n");
+		bus_unregister(&peci_bus_type);
+		return ret;
+	}
+
+	crc8_populate_msb(peci_crc8_table, PECI_CRC8_POLYNOMIAL);
+
+	if (IS_ENABLED(CONFIG_OF_DYNAMIC))
+		WARN_ON(of_reconfig_notifier_register(&peci_of_notifier));
+
+	is_registered = true;
+
+	return 0;
+}
+
+static void __exit peci_exit(void)
+{
+	if (IS_ENABLED(CONFIG_OF_DYNAMIC))
+		WARN_ON(of_reconfig_notifier_unregister(&peci_of_notifier));
+
+	unregister_chrdev_region(peci_devt, PECI_CDEV_MAX);
+	bus_unregister(&peci_bus_type);
+}
+
+postcore_initcall(peci_init);
+module_exit(peci_exit);
+
+MODULE_AUTHOR("Jason M Biils <jason.m.bills@linux.intel.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("PECI bus core module");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/peci.h b/include/linux/peci.h
new file mode 100644
index 000000000000..8730deb6673c
--- /dev/null
+++ b/include/linux/peci.h
@@ -0,0 +1,107 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018 Intel Corporation */
+
+#ifndef __LINUX_PECI_H
+#define __LINUX_PECI_H
+
+#include <linux/cdev.h>
+#include <linux/device.h>
+#include <linux/peci-ioctl.h>
+#include <linux/rtmutex.h>
+
+#define PECI_BUFFER_SIZE  32
+#define PECI_NAME_SIZE    32
+
+struct peci_xfer_msg {
+	u8	addr;
+	u8	tx_len;
+	u8	rx_len;
+	u8	tx_buf[PECI_BUFFER_SIZE];
+	u8	rx_buf[PECI_BUFFER_SIZE];
+} __attribute__((__packed__));
+
+struct peci_board_info {
+	char			type[PECI_NAME_SIZE];
+	u8			addr;	/* CPU client address */
+	struct device_node	*of_node;
+};
+
+struct peci_adapter {
+	struct module	*owner;
+	struct rt_mutex	bus_lock;
+	struct device	dev;
+	struct cdev	cdev;
+	int		nr;
+	char		name[PECI_NAME_SIZE];
+	int		(*xfer)(struct peci_adapter *adapter,
+				struct peci_xfer_msg *msg);
+	uint		cmd_mask;
+};
+
+static inline struct peci_adapter *to_peci_adapter(void *d)
+{
+	return container_of(d, struct peci_adapter, dev);
+}
+
+static inline void *peci_get_adapdata(const struct peci_adapter *adapter)
+{
+	return dev_get_drvdata(&adapter->dev);
+}
+
+static inline void peci_set_adapdata(struct peci_adapter *adapter, void *data)
+{
+	dev_set_drvdata(&adapter->dev, data);
+}
+
+struct peci_client {
+	struct device		dev;		/* the device structure */
+	struct peci_adapter	*adapter;	/* the adapter we sit on */
+	u8			addr;		/* CPU client address */
+	u8			idx;		/* device index */
+	char			name[PECI_NAME_SIZE];
+};
+
+static inline struct peci_client *to_peci_client(void *d)
+{
+	return container_of(d, struct peci_client, dev);
+}
+
+struct peci_device_id {
+	char		name[PECI_NAME_SIZE];
+	kernel_ulong_t	driver_data;	/* Data private to the driver */
+};
+
+struct peci_driver {
+	int				(*probe)(struct peci_client *client);
+	int				(*remove)(struct peci_client *client);
+	void				(*shutdown)(struct peci_client *client);
+	struct device_driver		driver;
+	const struct peci_device_id	*id_table;
+};
+
+static inline struct peci_driver *to_peci_driver(void *d)
+{
+	return container_of(d, struct peci_driver, driver);
+}
+
+/**
+ * module_peci_driver() - Helper macro for registering a modular PECI driver
+ * @__peci_driver: peci_driver struct
+ *
+ * Helper macro for PECI drivers which do not do anything special in module
+ * init/exit. This eliminates a lot of boilerplate. Each module may only
+ * use this macro once, and calling it replaces module_init() and module_exit()
+ */
+#define module_peci_driver(__peci_driver) \
+	module_driver(__peci_driver, peci_add_driver, peci_del_driver)
+
+/* use a define to avoid include chaining to get THIS_MODULE */
+#define peci_add_driver(driver) peci_register_driver(THIS_MODULE, driver)
+
+int  peci_register_driver(struct module *owner, struct peci_driver *drv);
+void peci_del_driver(struct peci_driver *driver);
+int  peci_add_adapter(struct peci_adapter *adapter);
+void peci_del_adapter(struct peci_adapter *adapter);
+int  peci_command(struct peci_adapter *adpater, enum peci_cmd cmd, void *vmsg);
+
+#endif /* __LINUX_PECI_H */
diff --git a/include/uapi/linux/peci-ioctl.h b/include/uapi/linux/peci-ioctl.h
new file mode 100644
index 000000000000..ec73847b9400
--- /dev/null
+++ b/include/uapi/linux/peci-ioctl.h
@@ -0,0 +1,200 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018 Intel Corporation */
+
+#ifndef __PECI_IOCTL_H
+#define __PECI_IOCTL_H
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/* Base Address of 48d */
+#define PECI_BASE_ADDR  0x30  /* The PECI client's default address of 0x30 */
+#define PECI_OFFSET_MAX 8     /* Max numver of CPU clients */
+
+/* PCI Access */
+#define MAX_PCI_READ_LEN 24   /* Number of bytes of the PCI Space read */
+
+#define PCI_BUS0_CPU0      0x00
+#define PCI_BUS0_CPU1      0x80
+#define PCI_CPUBUSNO_BUS   0x00
+#define PCI_CPUBUSNO_DEV   0x08
+#define PCI_CPUBUSNO_FUNC  0x02
+#define PCI_CPUBUSNO       0xcc
+#define PCI_CPUBUSNO_1     0xd0
+#define PCI_CPUBUSNO_VALID 0xd4
+
+/* Package Identifier Read Parameter Value */
+#define PKG_ID_CPU_ID               0x0000  /* CPUID Info */
+#define PKG_ID_PLATFORM_ID          0x0001  /* Platform ID */
+#define PKG_ID_UNCORE_ID            0x0002  /* Uncore Device ID */
+#define PKG_ID_MAX_THREAD_ID        0x0003  /* Max Thread ID */
+#define PKG_ID_MICROCODE_REV        0x0004  /* CPU Microcode Update Revision */
+#define PKG_ID_MACHINE_CHECK_STATUS 0x0005  /* Machine Check Status */
+
+/* RdPkgConfig Index */
+#define MBX_INDEX_CPU_ID            0   /* Package Identifier Read */
+#define MBX_INDEX_VR_DEBUG          1   /* VR Debug */
+#define MBX_INDEX_PKG_TEMP_READ     2   /* Package Temperature Read */
+#define MBX_INDEX_ENERGY_COUNTER    3   /* Energy counter */
+#define MBX_INDEX_ENERGY_STATUS     4   /* DDR Energy Status */
+#define MBX_INDEX_WAKE_MODE_BIT     5   /* "Wake on PECI" Mode bit */
+#define MBX_INDEX_EPI               6   /* Efficient Performance Indication */
+#define MBX_INDEX_PKG_RAPL_PERF     8   /* Pkg RAPL Performance Status Read */
+#define MBX_INDEX_PER_CORE_DTS_TEMP 9   /* Per Core DTS Temperature Read */
+#define MBX_INDEX_DTS_MARGIN        10  /* DTS thermal margin */
+#define MBX_INDEX_SKT_PWR_THRTL_DUR 11  /* Socket Power Throttled Duration */
+#define MBX_INDEX_CFG_TDP_CONTROL   12  /* TDP Config Control */
+#define MBX_INDEX_CFG_TDP_LEVELS    13  /* TDP Config Levels */
+#define MBX_INDEX_DDR_DIMM_TEMP     14  /* DDR DIMM Temperature */
+#define MBX_INDEX_CFG_ICCMAX        15  /* Configurable ICCMAX */
+#define MBX_INDEX_TEMP_TARGET       16  /* Temperature Target Read */
+#define MBX_INDEX_CURR_CFG_LIMIT    17  /* Current Config Limit */
+#define MBX_INDEX_DIMM_TEMP_READ    20  /* Package Thermal Status Read */
+#define MBX_INDEX_DRAM_IMC_TMP_READ 22  /* DRAM IMC Temperature Read */
+#define MBX_INDEX_DDR_CH_THERM_STAT 23  /* DDR Channel Thermal Status */
+#define MBX_INDEX_PKG_POWER_LIMIT1  26  /* Package Power Limit1 */
+#define MBX_INDEX_PKG_POWER_LIMIT2  27  /* Package Power Limit2 */
+#define MBX_INDEX_TDP               28  /* Thermal design power minimum */
+#define MBX_INDEX_TDP_HIGH          29  /* Thermal design power maximum */
+#define MBX_INDEX_TDP_UNITS         30  /* Units for power/energy registers */
+#define MBX_INDEX_RUN_TIME          31  /* Accumulated Run Time */
+#define MBX_INDEX_CONSTRAINED_TIME  32  /* Thermally Constrained Time Read */
+#define MBX_INDEX_TURBO_RATIO       33  /* Turbo Activation Ratio */
+#define MBX_INDEX_DDR_RAPL_PL1      34  /* DDR RAPL PL1 */
+#define MBX_INDEX_DDR_PWR_INFO_HIGH 35  /* DRAM Power Info Read (high) */
+#define MBX_INDEX_DDR_PWR_INFO_LOW  36  /* DRAM Power Info Read (low) */
+#define MBX_INDEX_DDR_RAPL_PL2      37  /* DDR RAPL PL2 */
+#define MBX_INDEX_DDR_RAPL_STATUS   38  /* DDR RAPL Performance Status */
+#define MBX_INDEX_DDR_HOT_ABSOLUTE  43  /* DDR Hottest Dimm Absolute Temp */
+#define MBX_INDEX_DDR_HOT_RELATIVE  44  /* DDR Hottest Dimm Relative Temp */
+#define MBX_INDEX_DDR_THROTTLE_TIME 45  /* DDR Throttle Time */
+#define MBX_INDEX_DDR_THERM_STATUS  46  /* DDR Thermal Status */
+#define MBX_INDEX_TIME_AVG_TEMP     47  /* Package time-averaged temperature */
+#define MBX_INDEX_TURBO_RATIO_LIMIT 49  /* Turbo Ratio Limit Read */
+#define MBX_INDEX_HWP_AUTO_OOB      53  /* HWP Autonomous Out-of-band */
+#define MBX_INDEX_DDR_WARM_BUDGET   55  /* DDR Warm Power Budget */
+#define MBX_INDEX_DDR_HOT_BUDGET    56  /* DDR Hot Power Budget */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM3 57  /* Package/Psys Power Limit3 */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM1 58  /* Package/Psys Power Limit1 */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM2 59  /* Package/Psys Power Limit2 */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM4 60  /* Package/Psys Power Limit4 */
+#define MBX_INDEX_PERF_LIMIT_REASON 65  /* Performance Limit Reasons */
+
+/* WrPkgConfig Index */
+#define MBX_INDEX_DIMM_AMBIENT 19
+#define MBX_INDEX_DIMM_TEMP    24
+
+enum peci_cmd {
+	PECI_CMD_XFER = 0,
+	PECI_CMD_PING,
+	PECI_CMD_GET_DIB,
+	PECI_CMD_GET_TEMP,
+	PECI_CMD_RD_PKG_CFG,
+	PECI_CMD_WR_PKG_CFG,
+	PECI_CMD_RD_IA_MSR,
+	PECI_CMD_WR_IA_MSR,
+	PECI_CMD_RD_PCI_CFG,
+	PECI_CMD_WR_PCI_CFG,
+	PECI_CMD_RD_PCI_CFG_LOCAL,
+	PECI_CMD_WR_PCI_CFG_LOCAL,
+	PECI_CMD_MAX
+};
+
+struct peci_ping_msg {
+	__u8 addr;
+} __attribute__((__packed__));
+
+struct peci_get_dib_msg {
+	__u8  addr;
+	__u32 dib;
+} __attribute__((__packed__));
+
+struct peci_get_temp_msg {
+	__u8  addr;
+	__s16 temp_raw;
+} __attribute__((__packed__));
+
+struct peci_rd_pkg_cfg_msg {
+	__u8  addr;
+	__u8  index;
+	__u16 param;
+	__u8  rx_len;
+	__u8  pkg_config[4];
+} __attribute__((__packed__));
+
+struct peci_wr_pkg_cfg_msg {
+	__u8  addr;
+	__u8  index;
+	__u16 param;
+	__u8  tx_len;
+	__u32 value;
+} __attribute__((__packed__));
+
+struct peci_rd_ia_msr_msg {
+	__u8  addr;
+	__u8  thread_id;
+	__u16 address;
+	__u64 value;
+} __attribute__((__packed__));
+
+struct peci_rd_pci_cfg_msg {
+	__u8  addr;
+	__u8  bus;
+	__u8  device;
+	__u8  function;
+	__u16 reg;
+	__u8  pci_config[4];
+} __attribute__((__packed__));
+
+struct peci_rd_pci_cfg_local_msg {
+	__u8  addr;
+	__u8  bus;
+	__u8  device;
+	__u8  function;
+	__u16 reg;
+	__u8  rx_len;
+	__u8  pci_config[4];
+} __attribute__((__packed__));
+
+struct peci_wr_pci_cfg_local_msg {
+	__u8  addr;
+	__u8  bus;
+	__u8  device;
+	__u8  function;
+	__u16 reg;
+	__u8  tx_len;
+	__u32 value;
+} __attribute__((__packed__));
+
+#define PECI_IOC_BASE  0xb6
+
+#define PECI_IOC_PING \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_PING, struct peci_ping_msg)
+
+#define PECI_IOC_GET_DIB \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_GET_DIB, struct peci_get_dib_msg)
+
+#define PECI_IOC_GET_TEMP \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_GET_TEMP, struct peci_get_temp_msg)
+
+#define PECI_IOC_RD_PKG_CFG \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_PKG_CFG, struct peci_rd_pkg_cfg_msg)
+
+#define PECI_IOC_WR_PKG_CFG \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_WR_PKG_CFG, struct peci_wr_pkg_cfg_msg)
+
+#define PECI_IOC_RD_IA_MSR \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_IA_MSR, struct peci_rd_ia_msr_msg)
+
+#define PECI_IOC_RD_PCI_CFG \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_PCI_CFG, struct peci_rd_pci_cfg_msg)
+
+#define PECI_IOC_RD_PCI_CFG_LOCAL \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_PCI_CFG_LOCAL, \
+	      struct peci_rd_pci_cfg_local_msg)
+
+#define PECI_IOC_WR_PCI_CFG_LOCAL \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_WR_PCI_CFG_LOCAL, \
+	      struct peci_wr_pci_cfg_local_msg)
+
+#endif /* __PECI_IOCTL_H */
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds PECI bus/adapter node of AST24xx/AST25xx into
aspeed-g4 and aspeed-g5.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 arch/arm/boot/dts/aspeed-g4.dtsi | 25 +++++++++++++++++++++++++
 arch/arm/boot/dts/aspeed-g5.dtsi | 25 +++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g4.dtsi b/arch/arm/boot/dts/aspeed-g4.dtsi
index 518d2bc7c7fc..f7992eee4d1f 100644
--- a/arch/arm/boot/dts/aspeed-g4.dtsi
+++ b/arch/arm/boot/dts/aspeed-g4.dtsi
@@ -29,6 +29,7 @@
 		serial3 = &uart4;
 		serial4 = &uart5;
 		serial5 = &vuart;
+		peci0 = &peci0;
 	};
 
 	cpus {
@@ -270,6 +271,13 @@
 				};
 			};
 
+			peci: peci@1e78b000 {
+				compatible = "simple-bus";
+				#address-cells = <1>;
+				#size-cells = <1>;
+				ranges = <0x0 0x1e78b000 0x60>;
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
@@ -313,6 +321,23 @@
 	};
 };
 
+&peci {
+	peci0: peci-bus@0 {
+		compatible = "aspeed,ast2400-peci";
+		reg = <0x0 0x60>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		interrupts = <15>;
+		clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+		clock-frequency = <24000000>;
+		msg-timing-nego = <1>;
+		addr-timing-nego = <1>;
+		rd-sampling-point = <8>;
+		cmd-timeout-ms = <1000>;
+		status = "disabled";
+	};
+};
+
 &i2c {
 	i2c_ic: interrupt-controller@0 {
 		#interrupt-cells = <1>;
diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index f9917717dd08..278791dba8a0 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -29,6 +29,7 @@
 		serial3 = &uart4;
 		serial4 = &uart5;
 		serial5 = &vuart;
+		peci0 = &peci0;
 	};
 
 	cpus {
@@ -320,6 +321,13 @@
 				};
 			};
 
+			peci: peci@1e78b000 {
+				compatible = "simple-bus";
+				#address-cells = <1>;
+				#size-cells = <1>;
+				ranges = <0x0 0x1e78b000 0x60>;
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
@@ -363,6 +371,23 @@
 	};
 };
 
+&peci {
+	peci0: peci-bus@0 {
+		compatible = "aspeed,ast2500-peci";
+		reg = <0x0 0x60>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		interrupts = <15>;
+		clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+		clock-frequency = <24000000>;
+		msg-timing-nego = <1>;
+		addr-timing-nego = <1>;
+		rd-sampling-point = <8>;
+		cmd-timeout-ms = <1000>;
+		status = "disabled";
+	};
+};
+
 &i2c {
 	i2c_ic: interrupt-controller@0 {
 		#interrupt-cells = <1>;
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds a dt-bindings document of PECI adapter driver for Aspeed
AST24xx/25xx SoCs.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 .../devicetree/bindings/peci/peci-aspeed.txt       | 60 ++++++++++++++++++++++
 1 file changed, 60 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt

diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
new file mode 100644
index 000000000000..4598bb8c20fa
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
@@ -0,0 +1,60 @@
+Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
+
+Required properties:
+- compatible        : Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
+		      - aspeed,ast2400-peci: Aspeed AST2400 family PECI
+					     controller
+		      - aspeed,ast2500-peci: Aspeed AST2500 family PECI
+					     controller
+- reg               : Should contain PECI controller registers location and
+		      length.
+- #address-cells    : Should be <1>.
+- #size-cells       : Should be <0>.
+- interrupts        : Should contain PECI controller interrupt.
+- clocks            : Should contain clock source for PECI controller.
+		      Should reference clkin.
+- clock_frequency   : Should contain the operation frequency of PECI controller
+		      in units of Hz.
+		      187500 ~ 24000000
+
+Optional properties:
+- msg-timing-nego   : Message timing negotiation period. This value will
+		      determine the period of message timing negotiation to be
+		      issued by PECI controller. The unit of the programmed
+		      value is four times of PECI clock period.
+		      0 ~ 255 (default: 1)
+- addr-timing-nego  : Address timing negotiation period. This value will
+		      determine the period of address timing negotiation to be
+		      issued by PECI controller. The unit of the programmed
+		      value is four times of PECI clock period.
+		      0 ~ 255 (default: 1)
+- rd-sampling-point : Read sampling point selection. The whole period of a bit
+		      time will be divided into 16 time frames. This value will
+		      determine the time frame in which the controller will
+		      sample PECI signal for data read back. Usually in the
+		      middle of a bit time is the best.
+		      0 ~ 15 (default: 8)
+- cmd_timeout_ms    : Command timeout in units of ms.
+		      1 ~ 60000 (default: 1000)
+
+Example:
+	peci: peci@1e78b000 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0x0 0x1e78b000 0x60>;
+
+		peci0: peci-bus@0 {
+			compatible = "aspeed,ast2500-peci";
+			reg = <0x0 0x60>;
+			#address-cells = <1>;
+			#size-cells = <0>;
+			interrupts = <15>;
+			clocks = <&clk_clkin>;
+			clock-frequency = <24000000>;
+			msg-timing-nego = <1>;
+			addr-timing-nego = <1>;
+			rd-sampling-point = <8>;
+			cmd-timeout-ms = <1000>;
+		};
+	};
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 08/10] Documentation: hwmon: Add documents for PECI hwmon client drivers
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds hwmon documents for PECI cputemp and dimmtemp drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 Documentation/hwmon/peci-cputemp  | 88 +++++++++++++++++++++++++++++++++++++++
 Documentation/hwmon/peci-dimmtemp | 50 ++++++++++++++++++++++
 2 files changed, 138 insertions(+)
 create mode 100644 Documentation/hwmon/peci-cputemp
 create mode 100644 Documentation/hwmon/peci-dimmtemp

diff --git a/Documentation/hwmon/peci-cputemp b/Documentation/hwmon/peci-cputemp
new file mode 100644
index 000000000000..cdd5ea49a4a2
--- /dev/null
+++ b/Documentation/hwmon/peci-cputemp
@@ -0,0 +1,88 @@
+Kernel driver peci-cputemp
+==========================
+
+Supported chips:
+	One of Intel server CPUs listed below which is connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+	Addresses scanned: PECI client address 0x30 - 0x37
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author:
+	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides Digital
+Thermal Sensor (DTS) thermal readings of the CPU package and CPU cores that are
+accessible using the PECI Client Command Suite via the processor PECI client.
+
+All temperature values are given in millidegree Celsius and will be measurable
+only when the target CPU is powered on.
+
+sysfs attributes
+----------------
+
+temp1_label		"Die"
+temp1_input		Provides current die temperature of the CPU package.
+temp1_max		Provides thermal control temperature of the CPU package
+			which is also known as Tcontrol.
+temp1_crit		Provides shutdown temperature of the CPU package which
+			is also known as the maximum processor junction
+			temperature, Tjmax or Tprochot.
+temp1_crit_hyst		Provides the hysteresis value from Tcontrol to Tjmax of
+			the CPU package.
+
+temp2_label		"DTS margin"
+temp2_input		Provides current DTS thermal margin to Tcontrol of the
+			CPU package. Value 0 means it reaches to Tcontrol
+			temperature. Sub-zero value means the die temperature
+			goes across Tconrtol to Tjmax.
+temp2_min		Provides the minimum DTS thermal margin to Tcontrol of
+			the CPU package.
+temp2_lcrit		Provides the value when the CPU package temperature
+			reaches to Tjmax.
+
+temp3_label		"Tcontrol"
+temp3_input		Provides current Tcontrol temperature of the CPU
+			package which is also known as Fan Temperature target.
+			Indicates the relative value from thermal monitor trip
+			temperature at which fans should be engaged.
+temp3_crit		Provides Tcontrol critical value of the CPU package
+			which is same to Tjmax.
+
+temp4_label		"Tthrottle"
+temp4_input		Provides current Tthrottle temperature of the CPU
+			package. Used for throttling temperature. If this value
+			is allowed and lower than Tjmax - the throttle will
+			occur and reported at lower than Tjmax.
+
+temp5_label		"Tjmax"
+temp5_input		Provides the maximum junction temperature, Tjmax of the
+			CPU package.
+
+temp[6-*]_label		Provides string "Core X", where X is resolved core
+			number.
+temp[6-*]_input		Provides current temperature of each core.
+temp[6-*]_max		Provides thermal control temperature of the core.
+temp[6-*]_crit		Provides shutdown temperature of the core.
+temp[6-*]_crit_hyst	Provides the hysteresis value from Tcontrol to Tjmax of
+			the core.
diff --git a/Documentation/hwmon/peci-dimmtemp b/Documentation/hwmon/peci-dimmtemp
new file mode 100644
index 000000000000..c54f2526188c
--- /dev/null
+++ b/Documentation/hwmon/peci-dimmtemp
@@ -0,0 +1,50 @@
+Kernel driver peci-dimmtemp
+===========================
+
+Supported chips:
+	One of Intel server CPUs listed below which is connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+	Addresses scanned: PECI client address 0x30 - 0x37
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author:
+	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides Digital
+Thermal Sensor (DTS) thermal readings of DIMM components that are accessible
+using the PECI Client Command Suite via the processor PECI client.
+
+All temperature values are given in millidegree Celsius and will be measurable
+only when the target CPU is powered on.
+
+sysfs attributes
+----------------
+
+temp[N]_label		Provides string "DIMM CI", where C is DIMM channel and
+			I is DIMM index of the populated DIMM.
+temp[N]_input		Provides current temperature of the populated DIMM.
+
+Note:
+	DIMM temperature attributes will appear when the client CPU's BIOS
+	completes memory training and testing.
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds dt-bindings documents for PECI cputemp and dimmtemp client
drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 .../devicetree/bindings/hwmon/peci-cputemp.txt     | 24 +++++++++++++++++++++
 .../devicetree/bindings/hwmon/peci-dimmtemp.txt    | 25 ++++++++++++++++++++++
 2 files changed, 49 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt

diff --git a/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
new file mode 100644
index 000000000000..d5530ef9cfd2
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
@@ -0,0 +1,24 @@
+Bindings for Intel PECI (Platform Environment Control Interface) cputemp driver.
+
+Required properties:
+- compatible : Should be "intel,peci-cputemp".
+- reg        : Should contain address of a client CPU. Address range of CPU
+	       clients is starting from 0x30 based on PECI specification.
+	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
+
+Example:
+	peci-bus@0 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		< more properties >
+
+		peci-cputemp@cpu0 {
+			compatible = "intel,peci-cputemp";
+			reg = <0x30>;
+		};
+
+		peci-cputemp@cpu1 {
+			compatible = "intel,peci-cputemp";
+			reg = <0x31>;
+		};
+	};
diff --git a/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
new file mode 100644
index 000000000000..56e5deb61e5c
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
@@ -0,0 +1,25 @@
+Bindings for Intel PECI (Platform Environment Control Interface) dimmtemp
+driver.
+
+Required properties:
+- compatible : Should be "intel,peci-dimmtemp".
+- reg        : Should contain address of a client CPU. Address range of CPU
+	       clients is starting from 0x30 based on PECI specification.
+	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
+
+Example:
+	peci-bus@0 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		< more properties >
+
+		peci-dimmtemp@cpu0 {
+			compatible = "intel,peci-dimmtemp";
+			reg = <0x30>;
+		};
+
+		peci-dimmtemp@cpu1 {
+			compatible = "intel,peci-dimmtemp";
+			reg = <0x31>;
+		};
+	};
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds PECI adapter driver implementation for Aspeed
AST24xx/AST25xx.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 drivers/peci/Kconfig       |  28 +++
 drivers/peci/Makefile      |   3 +
 drivers/peci/peci-aspeed.c | 504 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 535 insertions(+)
 create mode 100644 drivers/peci/peci-aspeed.c

diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 1fbc13f9e6c2..0e33420365de 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -14,4 +14,32 @@ config PECI
 	  processors and chipset components to external monitoring or control
 	  devices.
 
+	  If you want PECI support, you should say Y here and also to the
+	  specific driver for your bus adapter(s) below.
+
+if PECI
+
+#
+# PECI hardware bus configuration
+#
+
+menu "PECI Hardware Bus support"
+
+config PECI_ASPEED
+	tristate "Aspeed AST24xx/AST25xx PECI support"
+	select REGMAP_MMIO
+	depends on OF
+	depends on ARCH_ASPEED || COMPILE_TEST
+	help
+	  Say Y here if you want support for the Platform Environment Control
+	  Interface (PECI) bus adapter driver on the Aspeed AST24XX and AST25XX
+	  SoCs.
+
+	  This support is also available as a module.  If so, the module
+	  will be called peci-aspeed.
+
+endmenu
+
+endif # PECI
+
 endmenu
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 9e8615e0d3ff..886285e69765 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -4,3 +4,6 @@
 
 # Core functionality
 obj-$(CONFIG_PECI)		+= peci-core.o
+
+# Hardware specific bus drivers
+obj-$(CONFIG_PECI_ASPEED)	+= peci-aspeed.o
diff --git a/drivers/peci/peci-aspeed.c b/drivers/peci/peci-aspeed.c
new file mode 100644
index 000000000000..be2a1f327eb1
--- /dev/null
+++ b/drivers/peci/peci-aspeed.c
@@ -0,0 +1,504 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2012-2017 ASPEED Technology Inc.
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/peci.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+
+#define DUMP_DEBUG 0
+
+/* Aspeed PECI Registers */
+#define AST_PECI_CTRL     0x00
+#define AST_PECI_TIMING   0x04
+#define AST_PECI_CMD      0x08
+#define AST_PECI_CMD_CTRL 0x0c
+#define AST_PECI_EXP_FCS  0x10
+#define AST_PECI_CAP_FCS  0x14
+#define AST_PECI_INT_CTRL 0x18
+#define AST_PECI_INT_STS  0x1c
+#define AST_PECI_W_DATA0  0x20
+#define AST_PECI_W_DATA1  0x24
+#define AST_PECI_W_DATA2  0x28
+#define AST_PECI_W_DATA3  0x2c
+#define AST_PECI_R_DATA0  0x30
+#define AST_PECI_R_DATA1  0x34
+#define AST_PECI_R_DATA2  0x38
+#define AST_PECI_R_DATA3  0x3c
+#define AST_PECI_W_DATA4  0x40
+#define AST_PECI_W_DATA5  0x44
+#define AST_PECI_W_DATA6  0x48
+#define AST_PECI_W_DATA7  0x4c
+#define AST_PECI_R_DATA4  0x50
+#define AST_PECI_R_DATA5  0x54
+#define AST_PECI_R_DATA6  0x58
+#define AST_PECI_R_DATA7  0x5c
+
+/* AST_PECI_CTRL - 0x00 : Control Register */
+#define PECI_CTRL_SAMPLING_MASK     GENMASK(19, 16)
+#define PECI_CTRL_SAMPLING(x)       (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
+#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)
+#define PECI_CTRL_READ_MODE_MASK    GENMASK(13, 12)
+#define PECI_CTRL_READ_MODE(x)      (((x) << 12) & PECI_CTRL_READ_MODE_MASK)
+#define PECI_CTRL_READ_MODE_GET(x)  (((x) & PECI_CTRL_READ_MODE_MASK) >> 12)
+#define PECI_CTRL_READ_MODE_COUNT   BIT(12)
+#define PECI_CTRL_READ_MODE_DBG     BIT(13)
+#define PECI_CTRL_CLK_SOURCE_MASK   BIT(11)
+#define PECI_CTRL_CLK_SOURCE(x)     (((x) << 11) & PECI_CTRL_CLK_SOURCE_MASK)
+#define PECI_CTRL_CLK_SOURCE_GET(x) (((x) & PECI_CTRL_CLK_SOURCE_MASK) >> 11)
+#define PECI_CTRL_CLK_DIV_MASK      GENMASK(10, 8)
+#define PECI_CTRL_CLK_DIV(x)        (((x) << 8) & PECI_CTRL_CLK_DIV_MASK)
+#define PECI_CTRL_CLK_DIV_GET(x)    (((x) & PECI_CTRL_CLK_DIV_MASK) >> 8)
+#define PECI_CTRL_INVERT_OUT        BIT(7)
+#define PECI_CTRL_INVERT_IN         BIT(6)
+#define PECI_CTRL_BUS_CONTENT_EN    BIT(5)
+#define PECI_CTRL_PECI_EN           BIT(4)
+#define PECI_CTRL_PECI_CLK_EN       BIT(0)
+
+/* AST_PECI_TIMING - 0x04 : Timing Negotiation Register */
+#define PECI_TIMING_MESSAGE_MASK   GENMASK(15, 8)
+#define PECI_TIMING_MESSAGE(x)     (((x) << 8) & PECI_TIMING_MESSAGE_MASK)
+#define PECI_TIMING_MESSAGE_GET(x) (((x) & PECI_TIMING_MESSAGE_MASK) >> 8)
+#define PECI_TIMING_ADDRESS_MASK   GENMASK(7, 0)
+#define PECI_TIMING_ADDRESS(x)     ((x) & PECI_TIMING_ADDRESS_MASK)
+#define PECI_TIMING_ADDRESS_GET(x) ((x) & PECI_TIMING_ADDRESS_MASK)
+
+/* AST_PECI_CMD - 0x08 : Command Register */
+#define PECI_CMD_PIN_MON    BIT(31)
+#define PECI_CMD_STS_MASK   GENMASK(27, 24)
+#define PECI_CMD_STS_GET(x) (((x) & PECI_CMD_STS_MASK) >> 24)
+#define PECI_CMD_FIRE       BIT(0)
+
+/* AST_PECI_LEN - 0x0C : Read/Write Length Register */
+#define PECI_AW_FCS_EN       BIT(31)
+#define PECI_READ_LEN_MASK   GENMASK(23, 16)
+#define PECI_READ_LEN(x)     (((x) << 16) & PECI_READ_LEN_MASK)
+#define PECI_WRITE_LEN_MASK  GENMASK(15, 8)
+#define PECI_WRITE_LEN(x)    (((x) << 8) & PECI_WRITE_LEN_MASK)
+#define PECI_TAGET_ADDR_MASK GENMASK(7, 0)
+#define PECI_TAGET_ADDR(x)   ((x) & PECI_TAGET_ADDR_MASK)
+
+/* AST_PECI_EXP_FCS - 0x10 : Expected FCS Data Register */
+#define PECI_EXPECT_READ_FCS_MASK      GENMASK(23, 16)
+#define PECI_EXPECT_READ_FCS_GET(x)    (((x) & PECI_EXPECT_READ_FCS_MASK) >> 16)
+#define PECI_EXPECT_AW_FCS_AUTO_MASK   GENMASK(15, 8)
+#define PECI_EXPECT_AW_FCS_AUTO_GET(x) (((x) & PECI_EXPECT_AW_FCS_AUTO_MASK) \
+					>> 8)
+#define PECI_EXPECT_WRITE_FCS_MASK     GENMASK(7, 0)
+#define PECI_EXPECT_WRITE_FCS_GET(x)   ((x) & PECI_EXPECT_WRITE_FCS_MASK)
+
+/* AST_PECI_CAP_FCS - 0x14 : Captured FCS Data Register */
+#define PECI_CAPTURE_READ_FCS_MASK    GENMASK(23, 16)
+#define PECI_CAPTURE_READ_FCS_GET(x)  (((x) & PECI_CAPTURE_READ_FCS_MASK) >> 16)
+#define PECI_CAPTURE_WRITE_FCS_MASK   GENMASK(7, 0)
+#define PECI_CAPTURE_WRITE_FCS_GET(x) ((x) & PECI_CAPTURE_WRITE_FCS_MASK)
+
+/* AST_PECI_INT_CTRL/STS - 0x18/0x1c : Interrupt Register */
+#define PECI_INT_TIMING_RESULT_MASK GENMASK(31, 30)
+#define PECI_INT_TIMEOUT            BIT(4)
+#define PECI_INT_CONNECT            BIT(3)
+#define PECI_INT_W_FCS_BAD          BIT(2)
+#define PECI_INT_W_FCS_ABORT        BIT(1)
+#define PECI_INT_CMD_DONE           BIT(0)
+
+struct aspeed_peci {
+	struct peci_adapter	adaper;
+	struct device		*dev;
+	struct regmap		*regmap;
+	int			irq;
+	struct completion	xfer_complete;
+	u32			status;
+	u32			cmd_timeout_ms;
+};
+
+#define PECI_INT_MASK  (PECI_INT_TIMEOUT | PECI_INT_CONNECT | \
+			PECI_INT_W_FCS_BAD | PECI_INT_W_FCS_ABORT | \
+			PECI_INT_CMD_DONE)
+
+#define PECI_IDLE_CHECK_TIMEOUT_MS      50
+#define PECI_IDLE_CHECK_INTERVAL_MS     10
+
+#define PECI_RD_SAMPLING_POINT_DEFAULT  8
+#define PECI_RD_SAMPLING_POINT_MAX      15
+#define PECI_CLK_DIV_DEFAULT            0
+#define PECI_CLK_DIV_MAX                7
+#define PECI_MSG_TIMING_NEGO_DEFAULT    1
+#define PECI_MSG_TIMING_NEGO_MAX        255
+#define PECI_ADDR_TIMING_NEGO_DEFAULT   1
+#define PECI_ADDR_TIMING_NEGO_MAX       255
+#define PECI_CMD_TIMEOUT_MS_DEFAULT     1000
+#define PECI_CMD_TIMEOUT_MS_MAX         60000
+
+static int aspeed_peci_xfer_native(struct aspeed_peci *priv,
+				   struct peci_xfer_msg *msg)
+{
+	long err, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
+	u32 peci_head, peci_state, rx_data, cmd_sts;
+	ktime_t start, end;
+	s64 elapsed_ms;
+	int i, rc = 0;
+	uint reg;
+
+	start = ktime_get();
+
+	/* Check command sts and bus idle state */
+	while (!regmap_read(priv->regmap, AST_PECI_CMD, &cmd_sts) &&
+	       (cmd_sts & (PECI_CMD_STS_MASK | PECI_CMD_PIN_MON))) {
+		end = ktime_get();
+		elapsed_ms = ktime_to_ms(ktime_sub(end, start));
+		if (elapsed_ms >= PECI_IDLE_CHECK_TIMEOUT_MS) {
+			dev_dbg(priv->dev, "Timeout waiting for idle state!\n");
+			return -ETIMEDOUT;
+		}
+
+		usleep_range(PECI_IDLE_CHECK_INTERVAL_MS * 1000,
+			     (PECI_IDLE_CHECK_INTERVAL_MS * 1000) + 1000);
+	};
+
+	reinit_completion(&priv->xfer_complete);
+
+	peci_head = PECI_TAGET_ADDR(msg->addr) |
+				    PECI_WRITE_LEN(msg->tx_len) |
+				    PECI_READ_LEN(msg->rx_len);
+
+	rc = regmap_write(priv->regmap, AST_PECI_CMD_CTRL, peci_head);
+	if (rc)
+		return rc;
+
+	for (i = 0; i < msg->tx_len; i += 4) {
+		reg = i < 16 ? AST_PECI_W_DATA0 + i % 16 :
+			       AST_PECI_W_DATA4 + i % 16;
+		rc = regmap_write(priv->regmap, reg,
+				  (msg->tx_buf[i + 3] << 24) |
+				  (msg->tx_buf[i + 2] << 16) |
+				  (msg->tx_buf[i + 1] << 8) |
+				  msg->tx_buf[i + 0]);
+		if (rc)
+			return rc;
+	}
+
+	dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
+#if DUMP_DEBUG
+	print_hex_dump(KERN_DEBUG, "TX : ", DUMP_PREFIX_NONE, 16, 1,
+		       msg->tx_buf, msg->tx_len, true);
+#endif
+
+	rc = regmap_write(priv->regmap, AST_PECI_CMD, PECI_CMD_FIRE);
+	if (rc)
+		return rc;
+
+	err = wait_for_completion_interruptible_timeout(&priv->xfer_complete,
+							timeout);
+
+	dev_dbg(priv->dev, "INT_STS : 0x%08x\n", priv->status);
+	if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
+		dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
+			PECI_CMD_STS_GET(peci_state));
+	else
+		dev_dbg(priv->dev, "PECI_STATE : read error\n");
+
+	rc = regmap_write(priv->regmap, AST_PECI_CMD, 0);
+	if (rc)
+		return rc;
+
+	if (err <= 0 || !(priv->status & PECI_INT_CMD_DONE)) {
+		if (err < 0) { /* -ERESTARTSYS */
+			return (int)err;
+		} else if (err == 0) {
+			dev_dbg(priv->dev, "Timeout waiting for a response!\n");
+			return -ETIMEDOUT;
+		}
+
+		dev_dbg(priv->dev, "No valid response!\n");
+		return -EIO;
+	}
+
+	for (i = 0; i < msg->rx_len; i++) {
+		u8 byte_offset = i % 4;
+
+		if (byte_offset == 0) {
+			reg = i < 16 ? AST_PECI_R_DATA0 + i % 16 :
+				       AST_PECI_R_DATA4 + i % 16;
+			rc = regmap_read(priv->regmap, reg, &rx_data);
+			if (rc)
+				return rc;
+		}
+
+		msg->rx_buf[i] = (u8)(rx_data >> (byte_offset << 3));
+	}
+
+#if DUMP_DEBUG
+	print_hex_dump(KERN_DEBUG, "RX : ", DUMP_PREFIX_NONE, 16, 1,
+		       msg->rx_buf, msg->rx_len, true);
+#endif
+	if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
+		dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
+			PECI_CMD_STS_GET(peci_state));
+	else
+		dev_dbg(priv->dev, "PECI_STATE : read error\n");
+	dev_dbg(priv->dev, "------------------------\n");
+
+	return rc;
+}
+
+static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
+{
+	struct aspeed_peci *priv = arg;
+	u32 status_ack = 0;
+
+	if (regmap_read(priv->regmap, AST_PECI_INT_STS, &priv->status))
+		return IRQ_NONE;
+
+	/* Be noted that multiple interrupt bits can be set at the same time */
+	if (priv->status & PECI_INT_TIMEOUT) {
+		dev_dbg(priv->dev, "PECI_INT_TIMEOUT\n");
+		status_ack |= PECI_INT_TIMEOUT;
+	}
+
+	if (priv->status & PECI_INT_CONNECT) {
+		dev_dbg(priv->dev, "PECI_INT_CONNECT\n");
+		status_ack |= PECI_INT_CONNECT;
+	}
+
+	if (priv->status & PECI_INT_W_FCS_BAD) {
+		dev_dbg(priv->dev, "PECI_INT_W_FCS_BAD\n");
+		status_ack |= PECI_INT_W_FCS_BAD;
+	}
+
+	if (priv->status & PECI_INT_W_FCS_ABORT) {
+		dev_dbg(priv->dev, "PECI_INT_W_FCS_ABORT\n");
+		status_ack |= PECI_INT_W_FCS_ABORT;
+	}
+
+	/**
+	 * All commands should be ended up with a PECI_INT_CMD_DONE bit set
+	 * even in an error case.
+	 */
+	if (priv->status & PECI_INT_CMD_DONE) {
+		dev_dbg(priv->dev, "PECI_INT_CMD_DONE\n");
+		status_ack |= PECI_INT_CMD_DONE;
+		complete(&priv->xfer_complete);
+	}
+
+	if (regmap_write(priv->regmap, AST_PECI_INT_STS, status_ack))
+		return IRQ_NONE;
+
+	return IRQ_HANDLED;
+}
+
+static int aspeed_peci_init_ctrl(struct aspeed_peci *priv)
+{
+	u32 msg_timing_nego, addr_timing_nego, rd_sampling_point;
+	u32 clk_freq, clk_divisor, clk_div_val = 0;
+	struct clk *clkin;
+	int ret;
+
+	clkin = devm_clk_get(priv->dev, NULL);
+	if (IS_ERR(clkin)) {
+		dev_err(priv->dev, "Failed to get clk source.\n");
+		return PTR_ERR(clkin);
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "clock-frequency",
+				   &clk_freq);
+	if (ret < 0) {
+		dev_err(priv->dev,
+			"Could not read clock-frequency property.\n");
+		return ret;
+	}
+
+	clk_divisor = clk_get_rate(clkin) / clk_freq;
+	devm_clk_put(priv->dev, clkin);
+
+	while ((clk_divisor >> 1) && (clk_div_val < PECI_CLK_DIV_MAX))
+		clk_div_val++;
+
+	ret = of_property_read_u32(priv->dev->of_node, "msg-timing-nego",
+				   &msg_timing_nego);
+	if (ret || msg_timing_nego > PECI_MSG_TIMING_NEGO_MAX) {
+		dev_warn(priv->dev,
+			 "Invalid msg-timing-nego : %u, Use default : %u\n",
+			 msg_timing_nego, PECI_MSG_TIMING_NEGO_DEFAULT);
+		msg_timing_nego = PECI_MSG_TIMING_NEGO_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "addr-timing-nego",
+				   &addr_timing_nego);
+	if (ret || addr_timing_nego > PECI_ADDR_TIMING_NEGO_MAX) {
+		dev_warn(priv->dev,
+			 "Invalid addr-timing-nego : %u, Use default : %u\n",
+			 addr_timing_nego, PECI_ADDR_TIMING_NEGO_DEFAULT);
+		addr_timing_nego = PECI_ADDR_TIMING_NEGO_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "rd-sampling-point",
+				   &rd_sampling_point);
+	if (ret || rd_sampling_point > PECI_RD_SAMPLING_POINT_MAX) {
+		dev_warn(priv->dev,
+			 "Invalid rd-sampling-point : %u. Use default : %u\n",
+			 rd_sampling_point,
+			 PECI_RD_SAMPLING_POINT_DEFAULT);
+		rd_sampling_point = PECI_RD_SAMPLING_POINT_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "cmd-timeout-ms",
+				   &priv->cmd_timeout_ms);
+	if (ret || priv->cmd_timeout_ms > PECI_CMD_TIMEOUT_MS_MAX ||
+	    priv->cmd_timeout_ms == 0) {
+		dev_warn(priv->dev,
+			 "Invalid cmd-timeout-ms : %u. Use default : %u\n",
+			 priv->cmd_timeout_ms,
+			 PECI_CMD_TIMEOUT_MS_DEFAULT);
+		priv->cmd_timeout_ms = PECI_CMD_TIMEOUT_MS_DEFAULT;
+	}
+
+	ret = regmap_write(priv->regmap, AST_PECI_CTRL,
+			   PECI_CTRL_CLK_DIV(PECI_CLK_DIV_DEFAULT) |
+			   PECI_CTRL_PECI_CLK_EN);
+	if (ret)
+		return ret;
+
+	usleep_range(1000, 5000);
+
+	/**
+	 * Timing negotiation period setting.
+	 * The unit of the programmed value is 4 times of PECI clock period.
+	 */
+	ret = regmap_write(priv->regmap, AST_PECI_TIMING,
+			   PECI_TIMING_MESSAGE(msg_timing_nego) |
+			   PECI_TIMING_ADDRESS(addr_timing_nego));
+	if (ret)
+		return ret;
+
+	/* Clear interrupts */
+	ret = regmap_write(priv->regmap, AST_PECI_INT_STS, PECI_INT_MASK);
+	if (ret)
+		return ret;
+
+	/* Enable interrupts */
+	ret = regmap_write(priv->regmap, AST_PECI_INT_CTRL, PECI_INT_MASK);
+	if (ret)
+		return ret;
+
+	/* Read sampling point and clock speed setting */
+	ret = regmap_write(priv->regmap, AST_PECI_CTRL,
+			   PECI_CTRL_SAMPLING(rd_sampling_point) |
+			   PECI_CTRL_CLK_DIV(clk_div_val) |
+			   PECI_CTRL_PECI_EN | PECI_CTRL_PECI_CLK_EN);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static const struct regmap_config aspeed_peci_regmap_config = {
+	.reg_bits = 32,
+	.val_bits = 32,
+	.reg_stride = 4,
+	.max_register = AST_PECI_R_DATA7,
+	.val_format_endian = REGMAP_ENDIAN_LITTLE,
+	.fast_io = true,
+};
+
+static int aspeed_peci_xfer(struct peci_adapter *adaper,
+			    struct peci_xfer_msg *msg)
+{
+	struct aspeed_peci *priv = peci_get_adapdata(adaper);
+
+	return aspeed_peci_xfer_native(priv, msg);
+}
+
+static int aspeed_peci_probe(struct platform_device *pdev)
+{
+	struct aspeed_peci *priv;
+	struct resource *res;
+	void __iomem *base;
+	int ret = 0;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(&pdev->dev, priv);
+	priv->dev = &pdev->dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(base))
+		return PTR_ERR(base);
+
+	priv->regmap = devm_regmap_init_mmio(&pdev->dev, base,
+					     &aspeed_peci_regmap_config);
+	if (IS_ERR(priv->regmap))
+		return PTR_ERR(priv->regmap);
+
+	priv->irq = platform_get_irq(pdev, 0);
+	if (!priv->irq)
+		return -ENODEV;
+
+	ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
+			       IRQF_SHARED,
+			       "peci-aspeed-irq",
+			       priv);
+	if (ret < 0)
+		return ret;
+
+	init_completion(&priv->xfer_complete);
+
+	priv->adaper.dev.parent = priv->dev;
+	priv->adaper.dev.of_node = of_node_get(dev_of_node(priv->dev));
+	strlcpy(priv->adaper.name, pdev->name, sizeof(priv->adaper.name));
+	priv->adaper.xfer = aspeed_peci_xfer;
+	peci_set_adapdata(&priv->adaper, priv);
+
+	ret = aspeed_peci_init_ctrl(priv);
+	if (ret < 0)
+		return ret;
+
+	ret = peci_add_adapter(&priv->adaper);
+	if (ret < 0)
+		return ret;
+
+	dev_info(&pdev->dev, "peci bus %d registered, irq %d\n",
+		 priv->adaper.nr, priv->irq);
+
+	return 0;
+}
+
+static int aspeed_peci_remove(struct platform_device *pdev)
+{
+	struct aspeed_peci *priv = dev_get_drvdata(&pdev->dev);
+
+	peci_del_adapter(&priv->adaper);
+	of_node_put(priv->adaper.dev.of_node);
+
+	return 0;
+}
+
+static const struct of_device_id aspeed_peci_of_table[] = {
+	{ .compatible = "aspeed,ast2400-peci", },
+	{ .compatible = "aspeed,ast2500-peci", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
+
+static struct platform_driver aspeed_peci_driver = {
+	.probe  = aspeed_peci_probe,
+	.remove = aspeed_peci_remove,
+	.driver = {
+		.name           = "peci-aspeed",
+		.of_match_table = of_match_ptr(aspeed_peci_of_table),
+	},
+};
+module_platform_driver(aspeed_peci_driver);
+
+MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("Aspeed PECI driver");
+MODULE_LICENSE("GPL v2");
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds PECI cputemp and dimmtemp hwmon drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 drivers/hwmon/Kconfig         |  28 ++
 drivers/hwmon/Makefile        |   2 +
 drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
 drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
 4 files changed, 1245 insertions(+)
 create mode 100644 drivers/hwmon/peci-cputemp.c
 create mode 100644 drivers/hwmon/peci-dimmtemp.c

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index f249a4428458..c52f610f81d0 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
 	  This driver can also be built as a module.  If so, the module
 	  will be called nct7904.
 
+config SENSORS_PECI_CPUTEMP
+	tristate "PECI CPU temperature monitoring support"
+	depends on OF
+	depends on PECI
+	help
+	  If you say yes here you get support for the generic Intel PECI
+	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
+	  readings of the CPU package and CPU cores that are accessible using
+	  the PECI Client Command Suite via the processor PECI client.
+	  Check Documentation/hwmon/peci-cputemp for details.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called peci-cputemp.
+
+config SENSORS_PECI_DIMMTEMP
+	tristate "PECI DIMM temperature monitoring support"
+	depends on OF
+	depends on PECI
+	help
+	  If you say yes here you get support for the generic Intel PECI hwmon
+	  driver which provides Digital Thermal Sensor (DTS) thermal readings of
+	  DIMM components that are accessible using the PECI Client Command
+	  Suite via the processor PECI client.
+	  Check Documentation/hwmon/peci-dimmtemp for details.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called peci-dimmtemp.
+
 config SENSORS_NSA320
 	tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
 	depends on GPIOLIB && OF
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index e7d52a36e6c4..48d9598fcd3a 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)	+= nct7802.o
 obj-$(CONFIG_SENSORS_NCT7904)	+= nct7904.o
 obj-$(CONFIG_SENSORS_NSA320)	+= nsa320-hwmon.o
 obj-$(CONFIG_SENSORS_NTC_THERMISTOR)	+= ntc_thermistor.o
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
 obj-$(CONFIG_SENSORS_PC87360)	+= pc87360.o
 obj-$(CONFIG_SENSORS_PC87427)	+= pc87427.o
 obj-$(CONFIG_SENSORS_PCF8591)	+= pcf8591.o
diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
new file mode 100644
index 000000000000..f0bc92687512
--- /dev/null
+++ b/drivers/hwmon/peci-cputemp.c
@@ -0,0 +1,783 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/delay.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/peci.h>
+
+#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
+
+#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
+#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
+#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
+
+#define DEFAULT_CHANNEL_NUMS  5
+#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
+#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
+
+#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
+
+#define UPDATE_INTERVAL_MIN   HZ
+
+enum cpu_gens {
+	CPU_GEN_HSX, /* Haswell Xeon */
+	CPU_GEN_BRX, /* Broadwell Xeon */
+	CPU_GEN_SKX, /* Skylake Xeon */
+	CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+	u32 type;
+	u32 cpu_id;
+	u32 core_max;
+};
+
+struct temp_data {
+	bool valid;
+	s32  value;
+	unsigned long last_updated;
+};
+
+struct temp_group {
+	struct temp_data die;
+	struct temp_data dts_margin;
+	struct temp_data tcontrol;
+	struct temp_data tthrottle;
+	struct temp_data tjmax;
+	struct temp_data core[CORETEMP_CHANNEL_NUMS];
+};
+
+struct peci_cputemp {
+	struct peci_client *client;
+	struct device *dev;
+	char name[PECI_NAME_SIZE];
+	struct temp_group temp;
+	u8 addr;
+	uint cpu_no;
+	const struct cpu_gen_info *gen_info;
+	u32 core_mask;
+	u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
+	uint config_idx;
+	struct hwmon_channel_info temp_info;
+	const struct hwmon_channel_info *info[2];
+	struct hwmon_chip_info chip;
+};
+
+enum cputemp_channels {
+	channel_die,
+	channel_dts_mrgn,
+	channel_tcontrol,
+	channel_tthrottle,
+	channel_tjmax,
+	channel_core,
+};
+
+static const struct cpu_gen_info cpu_gen_info_table[] = {
+	{ .type = CPU_GEN_HSX,
+	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
+	  .core_max = CORE_MAX_ON_HSX },
+	{ .type = CPU_GEN_BRX,
+	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
+	  .core_max = CORE_MAX_ON_BDX },
+	{ .type = CPU_GEN_SKX,
+	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
+	  .core_max = CORE_MAX_ON_SKX },
+};
+
+static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
+	/* Die temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
+	HWMON_T_CRIT_HYST,
+
+	/* DTS margin temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
+
+	/* Tcontrol temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
+
+	/* Tthrottle temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+
+	/* Tjmax temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+
+	/* Core temperature - for all core channels */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
+	HWMON_T_CRIT_HYST,
+};
+
+static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
+	"Die",
+	"DTS margin",
+	"Tcontrol",
+	"Tthrottle",
+	"Tjmax",
+	"Core 0", "Core 1", "Core 2", "Core 3",
+	"Core 4", "Core 5", "Core 6", "Core 7",
+	"Core 8", "Core 9", "Core 10", "Core 11",
+	"Core 12", "Core 13", "Core 14", "Core 15",
+	"Core 16", "Core 17", "Core 18", "Core 19",
+	"Core 20", "Core 21", "Core 22", "Core 23",
+};
+
+static int send_peci_cmd(struct peci_cputemp *priv,
+			 enum peci_cmd cmd,
+			 void *msg)
+{
+	return peci_command(priv->client->adapter, cmd, msg);
+}
+
+static int need_update(struct temp_data *temp)
+{
+	if (temp->valid &&
+	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
+		return 0;
+
+	return 1;
+}
+
+static void mark_updated(struct temp_data *temp)
+{
+	temp->valid = true;
+	temp->last_updated = jiffies;
+}
+
+static s32 ten_dot_six_to_millidegree(s32 val)
+{
+	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
+}
+
+static int get_tjmax(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	int rc;
+
+	if (!priv->temp.tjmax.valid) {
+		msg.addr = priv->addr;
+		msg.index = MBX_INDEX_TEMP_TARGET;
+		msg.param = 0;
+		msg.rx_len = 4;
+
+		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+		if (rc)
+			return rc;
+
+		priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
+		priv->temp.tjmax.valid = true;
+	}
+
+	return 0;
+}
+
+static int get_tcontrol(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 tcontrol_margin;
+	s32 tthrottle_offset;
+	int rc;
+
+	if (!need_update(&priv->temp.tcontrol))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_TEMP_TARGET;
+	msg.param = 0;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	tcontrol_margin = msg.pkg_config[1];
+	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
+	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
+
+	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
+	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
+
+	mark_updated(&priv->temp.tcontrol);
+	mark_updated(&priv->temp.tthrottle);
+
+	return 0;
+}
+
+static int get_tthrottle(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 tcontrol_margin;
+	s32 tthrottle_offset;
+	int rc;
+
+	if (!need_update(&priv->temp.tthrottle))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_TEMP_TARGET;
+	msg.param = 0;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
+	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
+
+	tcontrol_margin = msg.pkg_config[1];
+	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
+	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
+
+	mark_updated(&priv->temp.tthrottle);
+	mark_updated(&priv->temp.tcontrol);
+
+	return 0;
+}
+
+static int get_die_temp(struct peci_cputemp *priv)
+{
+	struct peci_get_temp_msg msg;
+	int rc;
+
+	if (!need_update(&priv->temp.die))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+
+	rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
+	if (rc)
+		return rc;
+
+	priv->temp.die.value = priv->temp.tjmax.value +
+			       ((s32)msg.temp_raw * 1000 / 64);
+
+	mark_updated(&priv->temp.die);
+
+	return 0;
+}
+
+static int get_dts_margin(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 dts_margin;
+	int rc;
+
+	if (!need_update(&priv->temp.dts_margin))
+		return 0;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_DTS_MARGIN;
+	msg.param = 0;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
+
+	/**
+	 * Processors return a value of DTS reading in 10.6 format
+	 * (10 bits signed decimal, 6 bits fractional).
+	 * Error codes:
+	 *   0x8000: General sensor error
+	 *   0x8001: Reserved
+	 *   0x8002: Underflow on reading value
+	 *   0x8003-0x81ff: Reserved
+	 */
+	if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
+		return -EIO;
+
+	dts_margin = ten_dot_six_to_millidegree(dts_margin);
+
+	priv->temp.dts_margin.value = dts_margin;
+
+	mark_updated(&priv->temp.dts_margin);
+
+	return 0;
+}
+
+static int get_core_temp(struct peci_cputemp *priv, int core_index)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 core_dts_margin;
+	int rc;
+
+	if (!need_update(&priv->temp.core[core_index]))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
+	msg.param = core_index;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
+
+	/**
+	 * Processors return a value of the core DTS reading in 10.6 format
+	 * (10 bits signed decimal, 6 bits fractional).
+	 * Error codes:
+	 *   0x8000: General sensor error
+	 *   0x8001: Reserved
+	 *   0x8002: Underflow on reading value
+	 *   0x8003-0x81ff: Reserved
+	 */
+	if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
+		return -EIO;
+
+	core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
+
+	priv->temp.core[core_index].value = priv->temp.tjmax.value +
+					    core_dts_margin;
+
+	mark_updated(&priv->temp.core[core_index]);
+
+	return 0;
+}
+
+static int find_core_index(struct peci_cputemp *priv, int channel)
+{
+	int core_channel = channel - DEFAULT_CHANNEL_NUMS;
+	int idx, found = 0;
+
+	for (idx = 0; idx < priv->gen_info->core_max; idx++) {
+		if (priv->core_mask & BIT(idx)) {
+			if (core_channel == found)
+				break;
+
+			found++;
+		}
+	}
+
+	return idx;
+}
+
+static int cputemp_read_string(struct device *dev,
+			       enum hwmon_sensor_types type,
+			       u32 attr, int channel, const char **str)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int core_index;
+
+	switch (attr) {
+	case hwmon_temp_label:
+		if (channel < DEFAULT_CHANNEL_NUMS) {
+			*str = cputemp_label[channel];
+		} else {
+			core_index = find_core_index(priv, channel);
+			*str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
+		}
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_die(struct device *dev,
+			    enum hwmon_sensor_types type,
+			    u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_die_temp(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.die.value;
+		return 0;
+	case hwmon_temp_max:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value;
+		return 0;
+	case hwmon_temp_crit:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	case hwmon_temp_crit_hyst:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_dts_margin(struct device *dev,
+				   enum hwmon_sensor_types type,
+				   u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_dts_margin(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.dts_margin.value;
+		return 0;
+	case hwmon_temp_min:
+		*val = 0;
+		return 0;
+	case hwmon_temp_lcrit:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value - priv->temp.tjmax.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_tcontrol(struct device *dev,
+				 enum hwmon_sensor_types type,
+				 u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value;
+		return 0;
+	case hwmon_temp_crit:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_tthrottle(struct device *dev,
+				  enum hwmon_sensor_types type,
+				  u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_tthrottle(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tthrottle.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_tjmax(struct device *dev,
+			      enum hwmon_sensor_types type,
+			      u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_core(struct device *dev,
+			     enum hwmon_sensor_types type,
+			     u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int core_index = find_core_index(priv, channel);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_core_temp(priv, core_index);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.core[core_index].value;
+		return 0;
+	case hwmon_temp_max:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value;
+		return 0;
+	case hwmon_temp_crit:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	case hwmon_temp_crit_hyst:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read(struct device *dev,
+			enum hwmon_sensor_types type,
+			u32 attr, int channel, long *val)
+{
+	switch (channel) {
+	case channel_die:
+		return cputemp_read_die(dev, type, attr, channel, val);
+	case channel_dts_mrgn:
+		return cputemp_read_dts_margin(dev, type, attr, channel, val);
+	case channel_tcontrol:
+		return cputemp_read_tcontrol(dev, type, attr, channel, val);
+	case channel_tthrottle:
+		return cputemp_read_tthrottle(dev, type, attr, channel, val);
+	case channel_tjmax:
+		return cputemp_read_tjmax(dev, type, attr, channel, val);
+	default:
+		if (channel < CPUTEMP_CHANNEL_NUMS)
+			return cputemp_read_core(dev, type, attr, channel, val);
+
+		return -EOPNOTSUPP;
+	}
+}
+
+static umode_t cputemp_is_visible(const void *data,
+				  enum hwmon_sensor_types type,
+				  u32 attr, int channel)
+{
+	const struct peci_cputemp *priv = data;
+
+	if (priv->temp_config[channel] & BIT(attr))
+		return 0444;
+
+	return 0;
+}
+
+static const struct hwmon_ops cputemp_ops = {
+	.is_visible = cputemp_is_visible,
+	.read_string = cputemp_read_string,
+	.read = cputemp_read,
+};
+
+static int check_resolved_cores(struct peci_cputemp *priv)
+{
+	struct peci_rd_pci_cfg_local_msg msg;
+	int rc;
+
+	if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
+		return -EINVAL;
+
+	/* Get the RESOLVED_CORES register value */
+	msg.addr = priv->addr;
+	msg.bus = 1;
+	msg.device = 30;
+	msg.function = 3;
+	msg.reg = 0xB4;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
+	if (rc)
+		return rc;
+
+	priv->core_mask = msg.pci_config[3] << 24 |
+			  msg.pci_config[2] << 16 |
+			  msg.pci_config[1] << 8 |
+			  msg.pci_config[0];
+
+	if (!priv->core_mask)
+		return -EAGAIN;
+
+	dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
+	return 0;
+}
+
+static int create_core_temp_info(struct peci_cputemp *priv)
+{
+	int rc, i;
+
+	rc = check_resolved_cores(priv);
+	if (!rc) {
+		for (i = 0; i < priv->gen_info->core_max; i++) {
+			if (priv->core_mask & BIT(i)) {
+				priv->temp_config[priv->config_idx++] =
+						     config_table[channel_core];
+			}
+		}
+	}
+
+	return rc;
+}
+
+static int check_cpu_id(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	u32 cpu_id;
+	int i, rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_CPU_ID;
+	msg.param = PKG_ID_CPU_ID;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
+		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
+
+	for (i = 0; i < CPU_GEN_MAX; i++) {
+		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
+			priv->gen_info = &cpu_gen_info_table[i];
+			break;
+		}
+	}
+
+	if (!priv->gen_info)
+		return -ENODEV;
+
+	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
+	return 0;
+}
+
+static int peci_cputemp_probe(struct peci_client *client)
+{
+	struct device *dev = &client->dev;
+	struct peci_cputemp *priv;
+	struct device *hwmon_dev;
+	int rc;
+
+	if ((client->adapter->cmd_mask &
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
+		dev_err(dev, "Client doesn't support temperature monitoring\n");
+		return -EINVAL;
+	}
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, priv);
+	priv->client = client;
+	priv->dev = dev;
+	priv->addr = client->addr;
+	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
+
+	snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
+		 priv->cpu_no);
+
+	rc = check_cpu_id(priv);
+	if (rc) {
+		dev_err(dev, "Client CPU is not supported\n");
+		return rc;
+	}
+
+	priv->temp_config[priv->config_idx++] = config_table[channel_die];
+	priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
+	priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
+	priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
+	priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
+
+	rc = create_core_temp_info(priv);
+	if (rc)
+		dev_dbg(dev, "Failed to create core temp info\n");
+
+	priv->chip.ops = &cputemp_ops;
+	priv->chip.info = priv->info;
+
+	priv->info[0] = &priv->temp_info;
+
+	priv->temp_info.type = hwmon_temp;
+	priv->temp_info.config = priv->temp_config;
+
+	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
+							 priv->name,
+							 priv,
+							 &priv->chip,
+							 NULL);
+
+	if (IS_ERR(hwmon_dev))
+		return PTR_ERR(hwmon_dev);
+
+	dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
+
+	return 0;
+}
+
+static const struct of_device_id peci_cputemp_of_table[] = {
+	{ .compatible = "intel,peci-cputemp" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
+
+static struct peci_driver peci_cputemp_driver = {
+	.probe  = peci_cputemp_probe,
+	.driver = {
+		.name           = "peci-cputemp",
+		.of_match_table = of_match_ptr(peci_cputemp_of_table),
+	},
+};
+module_peci_driver(peci_cputemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("PECI cputemp driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
new file mode 100644
index 000000000000..78bf29cb2c4c
--- /dev/null
+++ b/drivers/hwmon/peci-dimmtemp.c
@@ -0,0 +1,432 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/delay.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/peci.h>
+#include <linux/workqueue.h>
+
+#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
+
+#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
+#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
+
+#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
+#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
+
+#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
+#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
+
+#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
+#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
+
+#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
+
+#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
+
+#define UPDATE_INTERVAL_MIN  HZ
+
+#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
+#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
+
+enum cpu_gens {
+	CPU_GEN_HSX, /* Haswell Xeon */
+	CPU_GEN_BRX, /* Broadwell Xeon */
+	CPU_GEN_SKX, /* Skylake Xeon */
+	CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+	u32 type;
+	u32 cpu_id;
+	u32 chan_rank_max;
+	u32 dimm_idx_max;
+};
+
+struct temp_data {
+	bool valid;
+	s32  value;
+	unsigned long last_updated;
+};
+
+struct peci_dimmtemp {
+	struct peci_client *client;
+	struct device *dev;
+	struct workqueue_struct *work_queue;
+	struct delayed_work work_handler;
+	char name[PECI_NAME_SIZE];
+	struct temp_data temp[DIMM_NUMS_MAX];
+	u8 addr;
+	uint cpu_no;
+	const struct cpu_gen_info *gen_info;
+	u32 dimm_mask;
+	int retry_count;
+	int channels;
+	u32 temp_config[DIMM_NUMS_MAX + 1];
+	struct hwmon_channel_info temp_info;
+	const struct hwmon_channel_info *info[2];
+	struct hwmon_chip_info chip;
+};
+
+static const struct cpu_gen_info cpu_gen_info_table[] = {
+	{ .type  = CPU_GEN_HSX,
+	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
+	  .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
+	  .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
+	{ .type  = CPU_GEN_BRX,
+	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
+	  .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
+	  .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
+	{ .type  = CPU_GEN_SKX,
+	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
+	  .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
+	  .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
+};
+
+static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
+	{ "DIMM A0", "DIMM A1", "DIMM A2" },
+	{ "DIMM B0", "DIMM B1", "DIMM B2" },
+	{ "DIMM C0", "DIMM C1", "DIMM C2" },
+	{ "DIMM D0", "DIMM D1", "DIMM D2" },
+	{ "DIMM E0", "DIMM E1", "DIMM E2" },
+	{ "DIMM F0", "DIMM F1", "DIMM F2" },
+	{ "DIMM G0", "DIMM G1", "DIMM G2" },
+	{ "DIMM H0", "DIMM H1", "DIMM H2" },
+};
+
+static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
+			 void *msg)
+{
+	return peci_command(priv->client->adapter, cmd, msg);
+}
+
+static int need_update(struct temp_data *temp)
+{
+	if (temp->valid &&
+	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
+		return 0;
+
+	return 1;
+}
+
+static void mark_updated(struct temp_data *temp)
+{
+	temp->valid = true;
+	temp->last_updated = jiffies;
+}
+
+static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
+{
+	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
+	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
+	struct peci_rd_pkg_cfg_msg msg;
+	int rc;
+
+	if (!need_update(&priv->temp[dimm_no]))
+		return 0;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_DDR_DIMM_TEMP;
+	msg.param = chan_rank;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
+
+	mark_updated(&priv->temp[dimm_no]);
+
+	return 0;
+}
+
+static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
+{
+	int dimm_nums_max = priv->gen_info->chan_rank_max *
+			    priv->gen_info->dimm_idx_max;
+	int idx, found = 0;
+
+	for (idx = 0; idx < dimm_nums_max; idx++) {
+		if (priv->dimm_mask & BIT(idx)) {
+			if (channel == found)
+				break;
+
+			found++;
+		}
+	}
+
+	return idx;
+}
+
+static int dimmtemp_read_string(struct device *dev,
+				enum hwmon_sensor_types type,
+				u32 attr, int channel, const char **str)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
+	int dimm_no, chan_rank, dimm_idx;
+
+	switch (attr) {
+	case hwmon_temp_label:
+		dimm_no = find_dimm_number(priv, channel);
+		chan_rank = dimm_no / dimm_idx_max;
+		dimm_idx = dimm_no % dimm_idx_max;
+		*str = dimmtemp_label[chan_rank][dimm_idx];
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
+			 u32 attr, int channel, long *val)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+	int dimm_no = find_dimm_number(priv, channel);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_dimm_temp(priv, dimm_no);
+		if (rc)
+			return rc;
+
+		*val = priv->temp[dimm_no].value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static umode_t dimmtemp_is_visible(const void *data,
+				   enum hwmon_sensor_types type,
+				   u32 attr, int channel)
+{
+	switch (attr) {
+	case hwmon_temp_label:
+	case hwmon_temp_input:
+		return 0444;
+	default:
+		return 0;
+	}
+}
+
+static const struct hwmon_ops dimmtemp_ops = {
+	.is_visible = dimmtemp_is_visible,
+	.read_string = dimmtemp_read_string,
+	.read = dimmtemp_read,
+};
+
+static int check_populated_dimms(struct peci_dimmtemp *priv)
+{
+	u32 chan_rank_max = priv->gen_info->chan_rank_max;
+	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
+	struct peci_rd_pkg_cfg_msg msg;
+	int chan_rank, dimm_idx;
+	int rc, channels = 0;
+
+	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
+		msg.addr = priv->addr;
+		msg.index = MBX_INDEX_DDR_DIMM_TEMP;
+		msg.param = chan_rank;
+		msg.rx_len = 4;
+
+		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+		if (rc) {
+			priv->dimm_mask = 0;
+			return rc;
+		}
+
+		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
+			if (msg.pkg_config[dimm_idx]) {
+				priv->dimm_mask |= BIT(chan_rank *
+						       chan_rank_max +
+						       dimm_idx);
+				channels++;
+			}
+		}
+	}
+
+	if (!priv->dimm_mask)
+		return -EAGAIN;
+
+	priv->channels = channels;
+
+	dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
+	return 0;
+}
+
+static int create_dimm_temp_info(struct peci_dimmtemp *priv)
+{
+	struct device *hwmon_dev;
+	int rc, i;
+
+	rc = check_populated_dimms(priv);
+	if (!rc) {
+		for (i = 0; i < priv->channels; i++)
+			priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
+
+		priv->chip.ops = &dimmtemp_ops;
+		priv->chip.info = priv->info;
+
+		priv->info[0] = &priv->temp_info;
+
+		priv->temp_info.type = hwmon_temp;
+		priv->temp_info.config = priv->temp_config;
+
+		hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
+								 priv->name,
+								 priv,
+								 &priv->chip,
+								 NULL);
+		rc = PTR_ERR_OR_ZERO(hwmon_dev);
+		if (!rc)
+			dev_dbg(priv->dev, "%s: sensor '%s'\n",
+				dev_name(hwmon_dev), priv->name);
+	} else if (rc == -EAGAIN) {
+		if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
+			queue_delayed_work(priv->work_queue,
+					   &priv->work_handler,
+					   DIMM_MASK_CHECK_DELAY_JIFFIES);
+			priv->retry_count++;
+			dev_dbg(priv->dev,
+				"Deferred DIMM temp info creation\n");
+		} else {
+			rc = -ETIMEDOUT;
+			dev_err(priv->dev,
+				"Timeout retrying DIMM temp info creation\n");
+		}
+	}
+
+	return rc;
+}
+
+static void create_dimm_temp_info_delayed(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
+						  work_handler);
+	int rc;
+
+	rc = create_dimm_temp_info(priv);
+	if (rc && rc != -EAGAIN)
+		dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
+}
+
+static int check_cpu_id(struct peci_dimmtemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	u32 cpu_id;
+	int i, rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_CPU_ID;
+	msg.param = PKG_ID_CPU_ID;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
+		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
+
+	for (i = 0; i < CPU_GEN_MAX; i++) {
+		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
+			priv->gen_info = &cpu_gen_info_table[i];
+			break;
+		}
+	}
+
+	if (!priv->gen_info)
+		return -ENODEV;
+
+	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
+	return 0;
+}
+
+static int peci_dimmtemp_probe(struct peci_client *client)
+{
+	struct device *dev = &client->dev;
+	struct peci_dimmtemp *priv;
+	int rc;
+
+	if ((client->adapter->cmd_mask &
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
+		dev_err(dev, "Client doesn't support temperature monitoring\n");
+		return -EINVAL;
+	}
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, priv);
+	priv->client = client;
+	priv->dev = dev;
+	priv->addr = client->addr;
+	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
+
+	snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
+		 priv->cpu_no);
+
+	rc = check_cpu_id(priv);
+	if (rc) {
+		dev_err(dev, "Client CPU is not supported\n");
+		return rc;
+	}
+
+	priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
+	if (!priv->work_queue)
+		return -ENOMEM;
+
+	INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
+
+	rc = create_dimm_temp_info(priv);
+	if (rc && rc != -EAGAIN) {
+		dev_err(dev, "Failed to create DIMM temp info\n");
+		goto err_free_wq;
+	}
+
+	return 0;
+
+err_free_wq:
+	destroy_workqueue(priv->work_queue);
+	return rc;
+}
+
+static int peci_dimmtemp_remove(struct peci_client *client)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
+
+	cancel_delayed_work(&priv->work_handler);
+	destroy_workqueue(priv->work_queue);
+
+	return 0;
+}
+
+static const struct of_device_id peci_dimmtemp_of_table[] = {
+	{ .compatible = "intel,peci-dimmtemp" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
+
+static struct peci_driver peci_dimmtemp_driver = {
+	.probe  = peci_dimmtemp_probe,
+	.remove = peci_dimmtemp_remove,
+	.driver = {
+		.name           = "peci-dimmtemp",
+		.of_match_table = of_match_ptr(peci_dimmtemp_of_table),
+	},
+};
+module_peci_driver(peci_dimmtemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("PECI dimmtemp driver");
+MODULE_LICENSE("GPL v2");
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v3 10/10] Add a maintainer for the PECI subsystem
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo
In-Reply-To: <20180410183212.16787-1-jae.hyun.yoo@linux.intel.com>

This commit adds a maintainer information for the PECI subsystem.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 MAINTAINERS | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5cd5ff0e4428..3e6917e1ad31 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10965,6 +10965,16 @@ L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
 F:	drivers/platform/x86/peaq-wmi.c
 
+PECI SUBSYSTEM
+M:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+M:	Jason M Biils <jason.m.bills@linux.intel.com>
+S:	Maintained
+F:	Documentation/devicetree/bindings/peci/
+F:	drivers/peci/
+F:	drivers/hwmon/peci-*.c
+F:	include/linux/peci.h
+F:	include/uapi/linux/peci-ioctl.h
+
 PER-CPU MEMORY ALLOCATOR
 M:	Tejun Heo <tj@kernel.org>
 M:	Christoph Lameter <cl@linux.com>
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [RFC bpf-next v2 1/8] bpf: add script and prepare bpf.h for new helpers documentation
From: Alexei Starovoitov @ 2018-04-10 18:16 UTC (permalink / raw)
  To: Quentin Monnet; +Cc: daniel, ast, netdev, oss-drivers, linux-doc, linux-man
In-Reply-To: <20180410144157.4831-2-quentin.monnet@netronome.com>

On Tue, Apr 10, 2018 at 03:41:50PM +0100, Quentin Monnet wrote:
> Remove previous "overview" of eBPF helpers from user bpf.h header.
> Replace it by a comment explaining how to process the new documentation
> (to come in following patches) with a Python script to produce RST, then
> man page documentation.
> 
> Also add the aforementioned Python script under scripts/. It is used to
> process include/uapi/linux/bpf.h and to extract helper descriptions, to
> turn it into a RST document that can further be processed with rst2man
> to produce a man page. The script takes one "--filename <path/to/file>"
> option. If the script is launched from scripts/ in the kernel root
> directory, it should be able to find the location of the header to
> parse, and "--filename <path/to/file>" is then optional. If it cannot
> find the file, then the option becomes mandatory. RST-formatted
> documentation is printed to standard output.
> 
> Typical workflow for producing the final man page would be:
> 
>     $ ./scripts/bpf_helpers_doc.py \
>             --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
>     $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
>     $ man /tmp/bpf-helpers.7
> 
> Note that the tool kernel-doc cannot be used to document eBPF helpers,
> whose signatures are not available directly in the header files
> (pre-processor directives are used to produce them at the beginning of
> the compilation process).
> 
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> ---
>  include/uapi/linux/bpf.h   | 406 ++------------------------------------------
>  scripts/bpf_helpers_doc.py | 414 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 430 insertions(+), 390 deletions(-)
>  create mode 100755 scripts/bpf_helpers_doc.py
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c5ec89732a8d..45f77f01e672 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -365,396 +365,22 @@ union bpf_attr {
>  	} raw_tracepoint;
>  } __attribute__((aligned(8)));
>  
> -/* BPF helper function descriptions:
> - *
> - * void *bpf_map_lookup_elem(&map, &key)
> - *     Return: Map value or NULL
> - *
> - * int bpf_map_update_elem(&map, &key, &value, flags)
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_map_delete_elem(&map, &key)
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_probe_read(void *dst, int size, void *src)
> - *     Return: 0 on success or negative error
> - *
> - * u64 bpf_ktime_get_ns(void)
> - *     Return: current ktime
> - *
> - * int bpf_trace_printk(const char *fmt, int fmt_size, ...)
> - *     Return: length of buffer written or negative error
> - *
> - * u32 bpf_prandom_u32(void)
> - *     Return: random value
> - *
> - * u32 bpf_raw_smp_processor_id(void)
> - *     Return: SMP processor ID
> - *
> - * int bpf_skb_store_bytes(skb, offset, from, len, flags)
> - *     store bytes into packet
> - *     @skb: pointer to skb
> - *     @offset: offset within packet from skb->mac_header
> - *     @from: pointer where to copy bytes from
> - *     @len: number of bytes to store into packet
> - *     @flags: bit 0 - if true, recompute skb->csum
> - *             other bits - reserved
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_l3_csum_replace(skb, offset, from, to, flags)
> - *     recompute IP checksum
> - *     @skb: pointer to skb
> - *     @offset: offset within packet where IP checksum is located
> - *     @from: old value of header field
> - *     @to: new value of header field
> - *     @flags: bits 0-3 - size of header field
> - *             other bits - reserved
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_l4_csum_replace(skb, offset, from, to, flags)
> - *     recompute TCP/UDP checksum
> - *     @skb: pointer to skb
> - *     @offset: offset within packet where TCP/UDP checksum is located
> - *     @from: old value of header field
> - *     @to: new value of header field
> - *     @flags: bits 0-3 - size of header field
> - *             bit 4 - is pseudo header
> - *             other bits - reserved
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_tail_call(ctx, prog_array_map, index)
> - *     jump into another BPF program
> - *     @ctx: context pointer passed to next program
> - *     @prog_array_map: pointer to map which type is BPF_MAP_TYPE_PROG_ARRAY
> - *     @index: 32-bit index inside array that selects specific program to run
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_clone_redirect(skb, ifindex, flags)
> - *     redirect to another netdev
> - *     @skb: pointer to skb
> - *     @ifindex: ifindex of the net device
> - *     @flags: bit 0 - if set, redirect to ingress instead of egress
> - *             other bits - reserved
> - *     Return: 0 on success or negative error
> - *
> - * u64 bpf_get_current_pid_tgid(void)
> - *     Return: current->tgid << 32 | current->pid
> - *
> - * u64 bpf_get_current_uid_gid(void)
> - *     Return: current_gid << 32 | current_uid
> - *
> - * int bpf_get_current_comm(char *buf, int size_of_buf)
> - *     stores current->comm into buf
> - *     Return: 0 on success or negative error
> - *
> - * u32 bpf_get_cgroup_classid(skb)
> - *     retrieve a proc's classid
> - *     @skb: pointer to skb
> - *     Return: classid if != 0
> - *
> - * int bpf_skb_vlan_push(skb, vlan_proto, vlan_tci)
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_skb_vlan_pop(skb)
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_skb_get_tunnel_key(skb, key, size, flags)
> - * int bpf_skb_set_tunnel_key(skb, key, size, flags)
> - *     retrieve or populate tunnel metadata
> - *     @skb: pointer to skb
> - *     @key: pointer to 'struct bpf_tunnel_key'
> - *     @size: size of 'struct bpf_tunnel_key'
> - *     @flags: room for future extensions
> - *     Return: 0 on success or negative error
> - *
> - * u64 bpf_perf_event_read(map, flags)
> - *     read perf event counter value
> - *     @map: pointer to perf_event_array map
> - *     @flags: index of event in the map or bitmask flags
> - *     Return: value of perf event counter read or error code
> - *
> - * int bpf_redirect(ifindex, flags)
> - *     redirect to another netdev
> - *     @ifindex: ifindex of the net device
> - *     @flags:
> - *	  cls_bpf:
> - *          bit 0 - if set, redirect to ingress instead of egress
> - *          other bits - reserved
> - *	  xdp_bpf:
> - *	    all bits - reserved
> - *     Return: cls_bpf: TC_ACT_REDIRECT on success or TC_ACT_SHOT on error
> - *	       xdp_bfp: XDP_REDIRECT on success or XDP_ABORT on error
> - * int bpf_redirect_map(map, key, flags)
> - *     redirect to endpoint in map
> - *     @map: pointer to dev map
> - *     @key: index in map to lookup
> - *     @flags: --
> - *     Return: XDP_REDIRECT on success or XDP_ABORT on error
> - *
> - * u32 bpf_get_route_realm(skb)
> - *     retrieve a dst's tclassid
> - *     @skb: pointer to skb
> - *     Return: realm if != 0
> - *
> - * int bpf_perf_event_output(ctx, map, flags, data, size)
> - *     output perf raw sample
> - *     @ctx: struct pt_regs*
> - *     @map: pointer to perf_event_array map
> - *     @flags: index of event in the map or bitmask flags
> - *     @data: data on stack to be output as raw data
> - *     @size: size of data
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_get_stackid(ctx, map, flags)
> - *     walk user or kernel stack and return id
> - *     @ctx: struct pt_regs*
> - *     @map: pointer to stack_trace map
> - *     @flags: bits 0-7 - numer of stack frames to skip
> - *             bit 8 - collect user stack instead of kernel
> - *             bit 9 - compare stacks by hash only
> - *             bit 10 - if two different stacks hash into the same stackid
> - *                      discard old
> - *             other bits - reserved
> - *     Return: >= 0 stackid on success or negative error
> - *
> - * s64 bpf_csum_diff(from, from_size, to, to_size, seed)
> - *     calculate csum diff
> - *     @from: raw from buffer
> - *     @from_size: length of from buffer
> - *     @to: raw to buffer
> - *     @to_size: length of to buffer
> - *     @seed: optional seed
> - *     Return: csum result or negative error code
> - *
> - * int bpf_skb_get_tunnel_opt(skb, opt, size)
> - *     retrieve tunnel options metadata
> - *     @skb: pointer to skb
> - *     @opt: pointer to raw tunnel option data
> - *     @size: size of @opt
> - *     Return: option size
> - *
> - * int bpf_skb_set_tunnel_opt(skb, opt, size)
> - *     populate tunnel options metadata
> - *     @skb: pointer to skb
> - *     @opt: pointer to raw tunnel option data
> - *     @size: size of @opt
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_skb_change_proto(skb, proto, flags)
> - *     Change protocol of the skb. Currently supported is v4 -> v6,
> - *     v6 -> v4 transitions. The helper will also resize the skb. eBPF
> - *     program is expected to fill the new headers via skb_store_bytes
> - *     and lX_csum_replace.
> - *     @skb: pointer to skb
> - *     @proto: new skb->protocol type
> - *     @flags: reserved
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_skb_change_type(skb, type)
> - *     Change packet type of skb.
> - *     @skb: pointer to skb
> - *     @type: new skb->pkt_type type
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_skb_under_cgroup(skb, map, index)
> - *     Check cgroup2 membership of skb
> - *     @skb: pointer to skb
> - *     @map: pointer to bpf_map in BPF_MAP_TYPE_CGROUP_ARRAY type
> - *     @index: index of the cgroup in the bpf_map
> - *     Return:
> - *       == 0 skb failed the cgroup2 descendant test
> - *       == 1 skb succeeded the cgroup2 descendant test
> - *        < 0 error
> - *
> - * u32 bpf_get_hash_recalc(skb)
> - *     Retrieve and possibly recalculate skb->hash.
> - *     @skb: pointer to skb
> - *     Return: hash
> - *
> - * u64 bpf_get_current_task(void)
> - *     Returns current task_struct
> - *     Return: current
> - *
> - * int bpf_probe_write_user(void *dst, void *src, int len)
> - *     safely attempt to write to a location
> - *     @dst: destination address in userspace
> - *     @src: source address on stack
> - *     @len: number of bytes to copy
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_current_task_under_cgroup(map, index)
> - *     Check cgroup2 membership of current task
> - *     @map: pointer to bpf_map in BPF_MAP_TYPE_CGROUP_ARRAY type
> - *     @index: index of the cgroup in the bpf_map
> - *     Return:
> - *       == 0 current failed the cgroup2 descendant test
> - *       == 1 current succeeded the cgroup2 descendant test
> - *        < 0 error
> - *
> - * int bpf_skb_change_tail(skb, len, flags)
> - *     The helper will resize the skb to the given new size, to be used f.e.
> - *     with control messages.
> - *     @skb: pointer to skb
> - *     @len: new skb length
> - *     @flags: reserved
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_skb_pull_data(skb, len)
> - *     The helper will pull in non-linear data in case the skb is non-linear
> - *     and not all of len are part of the linear section. Only needed for
> - *     read/write with direct packet access.
> - *     @skb: pointer to skb
> - *     @len: len to make read/writeable
> - *     Return: 0 on success or negative error
> - *
> - * s64 bpf_csum_update(skb, csum)
> - *     Adds csum into skb->csum in case of CHECKSUM_COMPLETE.
> - *     @skb: pointer to skb
> - *     @csum: csum to add
> - *     Return: csum on success or negative error
> - *
> - * void bpf_set_hash_invalid(skb)
> - *     Invalidate current skb->hash.
> - *     @skb: pointer to skb
> - *
> - * int bpf_get_numa_node_id()
> - *     Return: Id of current NUMA node.
> - *
> - * int bpf_skb_change_head()
> - *     Grows headroom of skb and adjusts MAC header offset accordingly.
> - *     Will extends/reallocae as required automatically.
> - *     May change skb data pointer and will thus invalidate any check
> - *     performed for direct packet access.
> - *     @skb: pointer to skb
> - *     @len: length of header to be pushed in front
> - *     @flags: Flags (unused for now)
> - *     Return: 0 on success or negative error
> - *
> - * int bpf_xdp_adjust_head(xdp_md, delta)
> - *     Adjust the xdp_md.data by delta
> - *     @xdp_md: pointer to xdp_md
> - *     @delta: An positive/negative integer to be added to xdp_md.data
> - *     Return: 0 on success or negative on error
> - *
> - * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
> - *     Copy a NUL terminated string from unsafe address. In case the string
> - *     length is smaller than size, the target is not padded with further NUL
> - *     bytes. In case the string length is larger than size, just count-1
> - *     bytes are copied and the last byte is set to NUL.
> - *     @dst: destination address
> - *     @size: maximum number of bytes to copy, including the trailing NUL
> - *     @unsafe_ptr: unsafe address
> - *     Return:
> - *       > 0 length of the string including the trailing NUL on success
> - *       < 0 error
> - *
> - * u64 bpf_get_socket_cookie(skb)
> - *     Get the cookie for the socket stored inside sk_buff.
> - *     @skb: pointer to skb
> - *     Return: 8 Bytes non-decreasing number on success or 0 if the socket
> - *     field is missing inside sk_buff
> - *
> - * u32 bpf_get_socket_uid(skb)
> - *     Get the owner uid of the socket stored inside sk_buff.
> - *     @skb: pointer to skb
> - *     Return: uid of the socket owner on success or overflowuid if failed.
> - *
> - * u32 bpf_set_hash(skb, hash)
> - *     Set full skb->hash.
> - *     @skb: pointer to skb
> - *     @hash: hash to set
> - *
> - * int bpf_setsockopt(bpf_socket, level, optname, optval, optlen)
> - *     Calls setsockopt. Not all opts are available, only those with
> - *     integer optvals plus TCP_CONGESTION.
> - *     Supported levels: SOL_SOCKET and IPPROTO_TCP
> - *     @bpf_socket: pointer to bpf_socket
> - *     @level: SOL_SOCKET or IPPROTO_TCP
> - *     @optname: option name
> - *     @optval: pointer to option value
> - *     @optlen: length of optval in bytes
> - *     Return: 0 or negative error
> - *
> - * int bpf_getsockopt(bpf_socket, level, optname, optval, optlen)
> - *     Calls getsockopt. Not all opts are available.
> - *     Supported levels: IPPROTO_TCP
> - *     @bpf_socket: pointer to bpf_socket
> - *     @level: IPPROTO_TCP
> - *     @optname: option name
> - *     @optval: pointer to option value
> - *     @optlen: length of optval in bytes
> - *     Return: 0 or negative error
> - *
> - * int bpf_sock_ops_cb_flags_set(bpf_sock_ops, flags)
> - *     Set callback flags for sock_ops
> - *     @bpf_sock_ops: pointer to bpf_sock_ops_kern struct
> - *     @flags: flags value
> - *     Return: 0 for no error
> - *             -EINVAL if there is no full tcp socket
> - *             bits in flags that are not supported by current kernel
> - *
> - * int bpf_skb_adjust_room(skb, len_diff, mode, flags)
> - *     Grow or shrink room in sk_buff.
> - *     @skb: pointer to skb
> - *     @len_diff: (signed) amount of room to grow/shrink
> - *     @mode: operation mode (enum bpf_adj_room_mode)
> - *     @flags: reserved for future use
> - *     Return: 0 on success or negative error code
> - *
> - * int bpf_sk_redirect_map(map, key, flags)
> - *     Redirect skb to a sock in map using key as a lookup key for the
> - *     sock in map.
> - *     @map: pointer to sockmap
> - *     @key: key to lookup sock in map
> - *     @flags: reserved for future use
> - *     Return: SK_PASS
> - *
> - * int bpf_sock_map_update(skops, map, key, flags)
> - *	@skops: pointer to bpf_sock_ops
> - *	@map: pointer to sockmap to update
> - *	@key: key to insert/update sock in map
> - *	@flags: same flags as map update elem
> - *
> - * int bpf_xdp_adjust_meta(xdp_md, delta)
> - *     Adjust the xdp_md.data_meta by delta
> - *     @xdp_md: pointer to xdp_md
> - *     @delta: An positive/negative integer to be added to xdp_md.data_meta
> - *     Return: 0 on success or negative on error
> - *
> - * int bpf_perf_event_read_value(map, flags, buf, buf_size)
> - *     read perf event counter value and perf event enabled/running time
> - *     @map: pointer to perf_event_array map
> - *     @flags: index of event in the map or bitmask flags
> - *     @buf: buf to fill
> - *     @buf_size: size of the buf
> - *     Return: 0 on success or negative error code
> - *
> - * int bpf_perf_prog_read_value(ctx, buf, buf_size)
> - *     read perf prog attached perf event counter and enabled/running time
> - *     @ctx: pointer to ctx
> - *     @buf: buf to fill
> - *     @buf_size: size of the buf
> - *     Return : 0 on success or negative error code
> - *
> - * int bpf_override_return(pt_regs, rc)
> - *	@pt_regs: pointer to struct pt_regs
> - *	@rc: the return value to set
> - *
> - * int bpf_msg_redirect_map(map, key, flags)
> - *     Redirect msg to a sock in map using key as a lookup key for the
> - *     sock in map.
> - *     @map: pointer to sockmap
> - *     @key: key to lookup sock in map
> - *     @flags: reserved for future use
> - *     Return: SK_PASS
> - *
> - * int bpf_bind(ctx, addr, addr_len)
> - *     Bind socket to address. Only binding to IP is supported, no port can be
> - *     set in addr.
> - *     @ctx: pointer to context of type bpf_sock_addr
> - *     @addr: pointer to struct sockaddr to bind socket to
> - *     @addr_len: length of sockaddr structure
> - *     Return: 0 on success or negative error code
> +/* The description below is an attempt at providing documentation to eBPF
> + * developers about the multiple available eBPF helper functions. It can be
> + * parsed and used to produce a manual page. The workflow is the following,
> + * and requires the rst2man utility:
> + *
> + *     $ ./scripts/bpf_helpers_doc.py \
> + *             --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
> + *     $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
> + *     $ man /tmp/bpf-helpers.7
> + *
> + * Note that in order to produce this external documentation, some RST
> + * formatting is used in the descriptions to get "bold" and "italics" in
> + * manual pages. Also note that the few trailing white spaces are
> + * intentional, removing them would break paragraphs for rst2man.
> + *
> + * Start of BPF helper function descriptions:
>   */
>  #define __BPF_FUNC_MAPPER(FN)		\
>  	FN(unspec),			\
> diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py
> new file mode 100755
> index 000000000000..3a15ba3f0a83
> --- /dev/null
> +++ b/scripts/bpf_helpers_doc.py
> @@ -0,0 +1,414 @@
> +#!/usr/bin/python3
> +#
> +# Copyright (C) 2018 Netronome Systems, Inc.
> +#
> +# This software is licensed under the GNU General License Version 2,
> +# June 1991 as shown in the file COPYING in the top-level directory of this
> +# source tree.

please use SPDX instead.

> +#
> +# THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
> +# WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
> +# BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
> +# FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
> +# OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
> +# THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
> +
> +# In case user attempts to run with Python 2.
> +from __future__ import print_function
> +
> +import argparse
> +import re
> +import sys, os
> +
> +class NoHelperFound(BaseException):
> +    pass
> +
> +class ParsingError(BaseException):
> +    def __init__(self, line='<line not provided>', reader=None):
> +        if reader:
> +            BaseException.__init__(self,
> +                                   'Error at file offset %d, parsing line: %s' %
> +                                   (reader.tell(), line))
> +        else:
> +            BaseException.__init__(self, 'Error parsing line: %s' % line)
> +
> +class Helper(object):
> +    """
> +    An object representing the description of an eBPF helper function.
> +    @proto: function prototype of the helper function
> +    @desc: textual description of the helper function
> +    @ret: description of the return value of the helper function
> +    """
> +    def __init__(self, proto='', desc='', ret=''):
> +        self.proto = proto
> +        self.desc = desc
> +        self.ret = ret
> +
> +    def proto_break_down(self):
> +        """
> +        Break down helper function protocol into smaller chunks: return type,
> +        name, distincts arguments.
> +        """
> +        arg_re = re.compile('^((const )?(struct )?(\w+|...))( (\**)(\w+))?$')
> +        res = {}
> +        proto_re = re.compile('^(.+) (\**)(\w+)\(((([^,]+)(, )?){1,5})\)$')
> +
> +        capture = proto_re.match(self.proto)
> +        res['ret_type'] = capture.group(1)
> +        res['ret_star'] = capture.group(2)
> +        res['name']     = capture.group(3)
> +        res['args'] = []
> +
> +        args    = capture.group(4).split(', ')
> +        for a in args:
> +            capture = arg_re.match(a)
> +            res['args'].append({
> +                'type' : capture.group(1),
> +                'star' : capture.group(6),
> +                'name' : capture.group(7)
> +            })
> +
> +        return res
> +
> +class HeaderParser(object):
> +    """
> +    An object used to parse a file in order to extract the documentation of a
> +    list of eBPF helper functions. All the helpers that can be retrieved are
> +    stored as Helper object, in the self.helpers() array.
> +    @filename: name of file to parse, usually include/uapi/linux/bpf.h in the
> +               kernel tree
> +    """
> +    def __init__(self, filename):
> +        self.reader = open(filename, 'r')
> +        self.line = ''
> +        self.helpers = []
> +
> +    def parse_helper(self):
> +        proto    = self.parse_proto()
> +        desc     = self.parse_desc()
> +        ret      = self.parse_ret()
> +        return Helper(proto=proto, desc=desc, ret=ret)
> +
> +    def parse_proto(self):
> +        # Argument can be of shape:
> +        #   - "void"
> +        #   - "type  name"
> +        #   - "type *name"
> +        #   - Same as above, with "const" and/or "struct" in front of type
> +        #   - "..." (undefined number of arguments, for bpf_trace_printk())
> +        # There is at least one term ("void"), and at most five arguments.
> +        p = re.compile('^ \* ((.+) \**\w+\((((const )?(struct )?(\w+|\.\.\.)( \**\w+)?)(, )?){1,5}\))$')
> +        capture = p.match(self.line)
> +        if not capture:
> +            raise NoHelperFound
> +        self.line = self.reader.readline()
> +        return capture.group(1)
> +
> +    def parse_desc(self):
> +        p = re.compile('^ \* \tDescription$')
> +        capture = p.match(self.line)
> +        if not capture:
> +            # Helper can have empty description and we might be parsing another
> +            # attribute: return but do not consume.
> +            return ''
> +        # Description can be several lines, some of them possibly empty, and it
> +        # stops when another subsection title is met.
> +        desc = ''
> +        while True:
> +            self.line = self.reader.readline()
> +            if self.line == ' *\n':
> +                desc += '\n'
> +            else:
> +                p = re.compile('^ \* \t\t(.*)')
> +                capture = p.match(self.line)
> +                if capture:
> +                    desc += capture.group(1) + '\n'
> +                else:
> +                    break
> +        return desc
> +
> +    def parse_ret(self):
> +        p = re.compile('^ \* \tReturn$')
> +        capture = p.match(self.line)
> +        if not capture:
> +            # Helper can have empty retval and we might be parsing another
> +            # attribute: return but do not consume.
> +            return ''
> +        # Return value description can be several lines, some of them possibly
> +        # empty, and it stops when another subsection title is met.
> +        ret = ''
> +        while True:
> +            self.line = self.reader.readline()
> +            if self.line == ' *\n':
> +                ret += '\n'
> +            else:
> +                p = re.compile('^ \* \t\t(.*)')
> +                capture = p.match(self.line)
> +                if capture:
> +                    ret += capture.group(1) + '\n'
> +                else:
> +                    break
> +        return ret
> +
> +    def run(self):
> +        # Advance to start of helper function descriptions.
> +        offset = self.reader.read().find('* Start of BPF helper function descriptions:')
> +        if offset == -1:
> +            raise Exception('Could not find start of eBPF helper descriptions list')
> +        self.reader.seek(offset)
> +        self.reader.readline()
> +        self.reader.readline()
> +        self.line = self.reader.readline()
> +
> +        while True:
> +            try:
> +                helper = self.parse_helper()
> +                self.helpers.append(helper)
> +            except NoHelperFound:
> +                break
> +
> +        self.reader.close()
> +        print('Parsed description of %d helper function(s)' % len(self.helpers),
> +              file=sys.stderr)
> +
> +###############################################################################
> +
> +class Printer(object):
> +    """
> +    A generic class for printers. Printers should be created with an array of
> +    Helper objects, and implement a way to print them in the desired fashion.
> +    @helpers: array of Helper objects to print to standard output
> +    """
> +    def __init__(self, helpers):
> +        self.helpers = helpers
> +
> +    def print_header(self):
> +        pass
> +
> +    def print_footer(self):
> +        pass
> +
> +    def print_one(self, helper):
> +        pass
> +
> +    def print_all(self):
> +        self.print_header()
> +        for helper in self.helpers:
> +            self.print_one(helper)
> +        self.print_footer()
> +
> +class PrinterRST(Printer):
> +    """
> +    A printer for dumping collected information about helpers as a ReStructured
> +    Text page compatible with the rst2man program, which can be used to
> +    generate a manual page for the helpers.
> +    @helpers: array of Helper objects to print to standard output
> +    """
> +    def print_header(self):
> +        header = '''\
> +.. Copyright (C) 2018 Netronome Systems, Inc.

I think would be good to capture copyrights of all authors that added
the helpers being documented. Since a lot of text was copied from commit
logs it's only fair to preserve the copyrights.
Such man page file is automatically generated by the python script
and script itself is copyrighted by Netronome. That's fine, but the text
of man page is not netronome only.
I'm not sure what would be the solution. May be something like:
"
Copyright (C) All BPF authors and contributors from 2011 to present
See git log include/uapi/linux/bpf.h for details
"
?

> +.. 
> +.. %%%LICENSE_START(VERBATIM)
> +.. Permission is granted to make and distribute verbatim copies of this
> +.. manual provided the copyright notice and this permission notice are
> +.. preserved on all copies.
> +.. 
> +.. Permission is granted to copy and distribute modified versions of this
> +.. manual under the conditions for verbatim copying, provided that the
> +.. entire resulting derived work is distributed under the terms of a
> +.. permission notice identical to this one.
> +.. 
> +.. Since the Linux kernel and libraries are constantly changing, this
> +.. manual page may be incorrect or out-of-date.  The author(s) assume no
> +.. responsibility for errors or omissions, or for damages resulting from
> +.. the use of the information contained herein.  The author(s) may not
> +.. have taken the same level of care in the production of this manual,
> +.. which is licensed free of charge, as they might when working
> +.. professionally.
> +.. 
> +.. Formatted or processed versions of this manual, if unaccompanied by
> +.. the source, must acknowledge the copyright and authors of this work.
> +.. %%%LICENSE_END
> +.. 
> +.. Please do not edit this file. It was generated from the documentation
> +.. located in file include/uapi/linux/bpf.h of the Linux kernel sources
> +.. (helpers description), and from scripts/bpf_helpers_doc.py in the same
> +.. repository (header and footer).
> +
> +===========
> +BPF-HELPERS
> +===========
> +-------------------------------------------------------------------------------
> +list of eBPF helper functions
> +-------------------------------------------------------------------------------
> +
> +:Manual section: 7
> +
> +DESCRIPTION
> +===========
> +
> +The extended Berkeley Packet Filter (eBPF) subsystem consists in programs
> +written in a pseudo-assembly language, then attached to one of the several
> +kernel hooks and run in reaction of specific events. This framework differs
> +from the older, "classic" BPF (or "cBPF") in several aspects, one of them being
> +the ability to call special functions (or "helpers") from within a program. For
> +security reasons, these functions are restricted to a white-list of helpers
> +defined in the kernel.

'for security reasons' sounds a bit odd. May be 'for safety reasons' ?
Or drop that part.

> +
> +These helpers are used by eBPF programs to interact with the system, or with
> +the context in which they work. For instance, they can be used to print
> +debugging messages, to get the time since the system was booted, to interact
> +with eBPF maps, or to manipulate network packets metadata. Since there are

s/packets metadata/packets/

> +several eBPF program types, and that they do not run in the same context, each
> +program type can only call a subset of those helpers.
> +
> +Due to eBPF conventions, a helper can not have more than five arguments.
> +
> +This document is an attempt to list and document the helpers available to eBPF
> +developers. They are sorted by chronological order (the oldest helpers in the
> +kernel at the top).
> +
> +HELPERS
> +=======
> +'''
> +        print(header)
> +
> +    def print_footer(self):
> +        footer = '''
> +NOTES
> +=====
> +
> +On the performance side, eBPF programs move to the stack all arguments to pass
> +to the helpers, and call directly into the compiled helper functions without

"move to the stack all arguments" ?! I'm not sure what you're trying to say.
The arguments stay in registers for the call.

> +requiring any foreign-function interface. As a result, calling helpers
> +introduce very little overhead.

not true. it's zero overhead. Literally. Very little is not the same as zero.

> +
> +EXAMPLES
> +========
> +
> +Example usage for most of the eBPF helpers listed in this manual page are
> +available within the Linux kernel sources, at the following locations:
> +
> +* *samples/bpf/*
> +* *tools/testing/selftests/bpf/*
> +
> +IMPLEMENTATION
> +==============
> +
> +This manual page is an effort to document the existing eBPF helper functions.
> +But as of this writing, the BPF sub-system is under heavy development. New eBPF
> +program or map types are added, along with new helper functions. Some helpers
> +are occasionally made available for additional program types. So in spite of
> +the efforts of the community, this page might not be up-to-date. If you want to
> +check by yourself what helper functions exist in your kernel, or what types of
> +programs they can support, here are some files among the kernel tree that you
> +may be interested in:
> +
> +* *include/uapi/linux/bpf.h* contains the full list of all helper functions.
> +* *net/core/filter.c* contains the definition of most network-related helper
> +  functions, and the list of program types from which they can be used.
> +* *kernel/trace/bpf_trace.c* is the equivalent for most tracing program-related
> +  helpers.
> +* *kernel/bpf/verifier.c* contains the functions used to check that valid types
> +  of eBPF maps are used with a given helper function.
> +* *kernel/bpf/* directory contains other files in which additional helpers are
> +  defined (for cgroups, sockmaps, etc.).
> +
> +Compatibility between helper functions and program types can generally be found
> +in the files where helper functions are defined. Look for the **struct
> +bpf_func_proto** objects and for functions returning them: these functions
> +contain a list of helpers that a given program type can call. Note that the
> +**default:** label of the **switch ... case** used to filter helpers can call
> +other functions, themselves allowing access to additional helpers. The
> +requirement for GPL license is also in those **struct bpf_func_proto**.

I think here would be good to add that most networking helpers are non-GPL
because they operate on packets which are abstract bytes on the wire,
whereas most tracing helpers are GPL, since they inspect the guts of
the linux kernel which is GPL itself.
That's the main reason why adding extra 'gpl=yes/no' for each helper
description is redundant.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/2] perf: riscv: preliminary RISC-V support
From: Alex Solomatnikov @ 2018-04-10 18:15 UTC (permalink / raw)
  To: Alan Kao
  Cc: Palmer Dabbelt, Albert Ou, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Jonathan Corbet, linux-riscv, linux-doc,
	linux-kernel, Nick Hu, Greentime Hu
In-Reply-To: <20180409070710.GA3844@andestech.com>

Alan,

I merged SBI emulation for perf counters and config:
https://github.com/riscv/riscv-pk/pull/98

You should be able to write these CSRs.

Thanks,
Alex

On Mon, Apr 9, 2018 at 12:07 AM, Alan Kao <alankao@andestech.com> wrote:
> On Thu, Apr 05, 2018 at 09:47:50AM -0700, Palmer Dabbelt wrote:
>> On Mon, 26 Mar 2018 00:57:54 PDT (-0700), alankao@andestech.com wrote:
>> >This patch provide a basic PMU, riscv_base_pmu, which supports two
>> >general hardware event, instructions and cycles.  Furthermore, this
>> >PMU serves as a reference implementation to ease the portings in
>> >the future.
>> >
>> >riscv_base_pmu should be able to run on any RISC-V machine that
>> >conforms to the Priv-Spec.  Note that the latest qemu model hasn't
>> >fully support a proper behavior of Priv-Spec 1.10 yet, but work
>> >around should be easy with very small fixes.  Please check
>> >https://github.com/riscv/riscv-qemu/pull/115 for future updates.
>> >
>> >Cc: Nick Hu <nickhu@andestech.com>
>> >Cc: Greentime Hu <greentime@andestech.com>
>> >Signed-off-by: Alan Kao <alankao@andestech.com>
>>
>> We should really be able to detect PMU types at runtime (via a device tree
>> entry) rather than requiring that a single PMU is built in to the kernel.
>> This will require a handful of modifications to how this patch works, which
>> I'll try to list below.
>
>> >+menu "PMU type"
>> >+    depends on PERF_EVENTS
>> >+
>> >+config RISCV_BASE_PMU
>> >+    bool "Base Performance Monitoring Unit"
>> >+    def_bool y
>> >+    help
>> >+      A base PMU that serves as a reference implementation and has limited
>> >+      feature of perf.
>> >+
>> >+endmenu
>> >+
>>
>> Rather than a menu where a single PMU can be selected, there should be
>> options to enable or disable support for each PMU type -- this is just like
>> how all our other drivers work.
>>
>
> I see.  Sure.  The descriptions and implementation will be refined in v3.
>
>> >+struct pmu * __weak __init riscv_init_platform_pmu(void)
>> >+{
>> >+    riscv_pmu = &riscv_base_pmu;
>> >+    return riscv_pmu->pmu;
>> >+}
>>
>> Rather than relying on a weak symbol that gets overridden by other PMU
>> types, this should look through the device tree for a compatible PMU (in the
>> case of just the base PMU it could be any RISC-V hart) and install a PMU
>> handler for it.  There'd probably be some sort of priority scheme here, like
>> there are for other driver subsystems, where we'd pick the best PMU driver
>> that's compatible with the PMUs on every hart.
>>
>> >+
>> >+int __init init_hw_perf_events(void)
>> >+{
>> >+    struct pmu *pmu = riscv_init_platform_pmu();
>> >+
>> >+    perf_irq = NULL;
>> >+    perf_pmu_register(pmu, "cpu", PERF_TYPE_RAW);
>> >+    return 0;
>> >+}
>> >+arch_initcall(init_hw_perf_events);
>>
>> Since we only have a single PMU type right now this isn't critical to handle
>> right away, but we will have to refactor this before adding another PMU.
>
> I see.  My rough plan is to do the device tree parsing here, and if no specific
> PMU string is found then just register the base PMU proposed in this patch.
> How about this idea?
>
> Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC bpf-next v2 2/8] bpf: add documentation for eBPF helpers (01-11)
From: Alexei Starovoitov @ 2018-04-10 17:56 UTC (permalink / raw)
  To: Quentin Monnet; +Cc: daniel, ast, netdev, oss-drivers, linux-doc, linux-man
In-Reply-To: <20180410144157.4831-3-quentin.monnet@netronome.com>

On Tue, Apr 10, 2018 at 03:41:51PM +0100, Quentin Monnet wrote:
> Add documentation for eBPF helper functions to bpf.h user header file.
> This documentation can be parsed with the Python script provided in
> another commit of the patch series, in order to provide a RST document
> that can later be converted into a man page.
> 
> The objective is to make the documentation easily understandable and
> accessible to all eBPF developers, including beginners.
> 
> This patch contains descriptions for the following helper functions, all
> written by Alexei:
> 
> - bpf_map_lookup_elem()
> - bpf_map_update_elem()
> - bpf_map_delete_elem()
> - bpf_probe_read()
> - bpf_ktime_get_ns()
> - bpf_trace_printk()
> - bpf_skb_store_bytes()
> - bpf_l3_csum_replace()
> - bpf_l4_csum_replace()
> - bpf_tail_call()
> - bpf_clone_redirect()
> 
> Cc: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> ---
>  include/uapi/linux/bpf.h | 199 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 199 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 45f77f01e672..2bc653a3a20f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -381,6 +381,205 @@ union bpf_attr {
>   * intentional, removing them would break paragraphs for rst2man.
>   *
>   * Start of BPF helper function descriptions:
> + *
> + * void *bpf_map_lookup_elem(struct bpf_map *map, void *key)
> + * 	Description
> + * 		Perform a lookup in *map* for an entry associated to *key*.
> + * 	Return
> + * 		Map value associated to *key*, or **NULL** if no entry was
> + * 		found.
> + *
> + * int bpf_map_update_elem(struct bpf_map *map, void *key, void *value, u64 flags)
> + * 	Description
> + * 		Add or update the value of the entry associated to *key* in
> + * 		*map* with *value*. *flags* is one of:
> + *
> + * 		**BPF_NOEXIST**
> + * 			The entry for *key* must not exist in the map.
> + * 		**BPF_EXIST**
> + * 			The entry for *key* must already exist in the map.
> + * 		**BPF_ANY**
> + * 			No condition on the existence of the entry for *key*.
> + *
> + * 		These flags are only useful for maps of type
> + * 		**BPF_MAP_TYPE_HASH**. For all other map types, **BPF_ANY**
> + * 		should be used.

I think that's not entirely accurate.
The flags work as expected for all other map types as well
and for lru map, sockmap, map in map the flags have practical use cases.

> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_map_delete_elem(struct bpf_map *map, void *key)
> + * 	Description
> + * 		Delete entry with *key* from *map*.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_probe_read(void *dst, u32 size, const void *src)
> + * 	Description
> + * 		For tracing programs, safely attempt to read *size* bytes from
> + * 		address *src* and store the data in *dst*.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * u64 bpf_ktime_get_ns(void)
> + * 	Description
> + * 		Return the time elapsed since system boot, in nanoseconds.
> + * 	Return
> + * 		Current *ktime*.
> + *
> + * int bpf_trace_printk(const char *fmt, u32 fmt_size, ...)
> + * 	Description
> + * 		This helper is a "printk()-like" facility for debugging. It
> + * 		prints a message defined by format *fmt* (of size *fmt_size*)
> + * 		to file *\/sys/kernel/debug/tracing/trace* from DebugFS, if
> + * 		available. It can take up to three additional **u64**
> + * 		arguments (as an eBPF helpers, the total number of arguments is
> + * 		limited to five). Each time the helper is called, it appends a
> + * 		line that looks like the following:
> + *
> + * 		::
> + *
> + * 			telnet-470   [001] .N.. 419421.045894: 0x00000001: BPF command: 2
> + *
> + * 		In the above:
> + *
> + * 			* ``telnet`` is the name of the current task.
> + * 			* ``470`` is the PID of the current task.
> + * 			* ``001`` is the CPU number on which the task is
> + * 			  running.
> + * 			* In ``.N..``, each character refers to a set of
> + * 			  options (whether irqs are enabled, scheduling
> + * 			  options, whether hard/softirqs are running, level of
> + * 			  preempt_disabled respectively). **N** means that
> + * 			  **TIF_NEED_RESCHED** and **PREEMPT_NEED_RESCHED**
> + * 			  are set.
> + * 			* ``419421.045894`` is a timestamp.
> + * 			* ``0x00000001`` is a fake value used by BPF for the
> + * 			  instruction pointer register.
> + * 			* ``BPF command: 2`` is the message formatted with
> + * 			  *fmt*.

the above depends on how trace_pipe was configured. It's a default
configuration for many, but would be good to explain this a bit better.

> + *
> + * 		The conversion specifiers supported by *fmt* are similar, but
> + * 		more limited than for printk(). They are **%d**, **%i**,
> + * 		**%u**, **%x**, **%ld**, **%li**, **%lu**, **%lx**, **%lld**,
> + * 		**%lli**, **%llu**, **%llx**, **%p**, **%s**. No modifier (size
> + * 		of field, padding with zeroes, etc.) is available, and the
> + * 		helper will silently fail if it encounters an unknown
> + * 		specifier.

This is not true. bpf_trace_printk will return -EINVAL for unknown specifier.

> + *
> + * 		Also, note that **bpf_trace_printk**\ () is slow, and should
> + * 		only be used for debugging purposes. For passing values to user
> + * 		space, perf events should be preferred.

please mention the giant dmesg warning that people will definitely
notice when they try to use this helper.

> + * 	Return
> + * 		The number of bytes written to the buffer, or a negative error
> + * 		in case of failure.
> + *
> + * int bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void *from, u32 len, u64 flags)
> + * 	Description
> + * 		Store *len* bytes from address *from* into the packet
> + * 		associated to *skb*, at *offset*. *flags* are a combination of
> + * 		**BPF_F_RECOMPUTE_CSUM** (automatically recompute the
> + * 		checksum for the packet after storing the bytes) and
> + * 		**BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
> + * 		**->swhash** and *skb*\ **->l4hash** to 0).
> + *
> + * 		A call to this helper is susceptible to change data from the
> + * 		packet. Therefore, at load time, all checks on pointers
> + * 		previously done by the verifier are invalidated and must be
> + * 		performed again.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_l3_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64 to, u64 size)
> + * 	Description
> + * 		Recompute the IP checksum for the packet associated to *skb*.
> + * 		Computation is incremental, so the helper must know the former
> + * 		value of the header field that was modified (*from*), the new
> + * 		value of this field (*to*), and the number of bytes (2 or 4)
> + * 		for this field, stored in *size*. Alternatively, it is possible
> + * 		to store the difference between the previous and the new values
> + * 		of the header field in *to*, by setting *from* and *size* to 0.
> + * 		For both methods, *offset* indicates the location of the IP
> + * 		checksum within the packet.
> + *
> + * 		A call to this helper is susceptible to change data from the
> + * 		packet. Therefore, at load time, all checks on pointers
> + * 		previously done by the verifier are invalidated and must be
> + * 		performed again.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_l4_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64 to, u64 flags)
> + * 	Description
> + * 		Recompute the TCP or UDP checksum for the packet associated to
> + * 		*skb*. Computation is incremental, so the helper must know the
> + * 		former value of the header field that was modified (*from*),
> + * 		the new value of this field (*to*), and the number of bytes (2
> + * 		or 4) for this field, stored on the lowest four bits of
> + * 		*flags*. Alternatively, it is possible to store the difference
> + * 		between the previous and the new values of the header field in
> + * 		*to*, by setting *from* and the four lowest bits of *flags* to
> + * 		0. For both methods, *offset* indicates the location of the IP
> + * 		checksum within the packet. In addition to the size of the
> + * 		field, *flags* can be added (bitwise OR) actual flags. With
> + * 		**BPF_F_MARK_MANGLED_0**, a null checksum is left untouched
> + * 		(unless **BPF_F_MARK_ENFORCE** is added as well), and for
> + * 		updates resulting in a null checksum the value is set to
> + * 		**CSUM_MANGLED_0** instead. Flag **BPF_F_PSEUDO_HDR**
> + * 		indicates the checksum is to be computed against a
> + * 		pseudo-header.
> + *
> + * 		A call to this helper is susceptible to change data from the
> + * 		packet. Therefore, at load time, all checks on pointers
> + * 		previously done by the verifier are invalidated and must be
> + * 		performed again.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_tail_call(void *ctx, struct bpf_map *prog_array_map, u32 index)
> + * 	Description
> + * 		This special helper is used to trigger a "tail call", or in
> + * 		other words, to jump into another eBPF program. The contents of
> + * 		eBPF registers and stack are not modified, the new program
> + * 		"inherits" them from the caller. This mechanism allows for

"inherits" is a technically correct, but misleading statement,
since callee program cannot access caller's registers and stack.

> + * 		program chaining, either for raising the maximum number of
> + * 		available eBPF instructions, or to execute given programs in
> + * 		conditional blocks. For security reasons, there is an upper
> + * 		limit to the number of successive tail calls that can be
> + * 		performed.
> + *
> + * 		Upon call of this helper, the program attempts to jump into a
> + * 		program referenced at index *index* in *prog_array_map*, a
> + * 		special map of type **BPF_MAP_TYPE_PROG_ARRAY**, and passes
> + * 		*ctx*, a pointer to the context.
> + *
> + * 		If the call succeeds, the kernel immediately runs the first
> + * 		instruction of the new program. This is not a function call,
> + * 		and it never goes back to the previous program. If the call
> + * 		fails, then the helper has no effect, and the caller continues
> + * 		to run its own instructions. A call can fail if the destination
> + * 		program for the jump does not exist (i.e. *index* is superior
> + * 		to the number of entries in *prog_array_map*), or if the
> + * 		maximum number of tail calls has been reached for this chain of
> + * 		programs. This limit is defined in the kernel by the macro
> + * 		**MAX_TAIL_CALL_CNT** (not accessible to user space), which
> + * 		is currently set to 32.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
> + *
> + * int bpf_clone_redirect(struct sk_buff *skb, u32 ifindex, u64 flags)
> + * 	Description
> + * 		Clone and redirect the packet associated to *skb* to another
> + * 		net device of index *ifindex*. The only flag supported for now
> + * 		is **BPF_F_INGRESS**, which indicates the packet is to be
> + * 		redirected to the ingress interface instead of (by default)
> + * 		egress.

imo the above sentence is prone to misinterpretation.
Can you rephrase it to say that both redirect to ingress and redirect to egress
are supported and flag is used to indicate which path to take ?

> + *
> + * 		A call to this helper is susceptible to change data from the
> + * 		packet. Therefore, at load time, all checks on pointers
> + * 		previously done by the verifier are invalidated and must be
> + * 		performed again.
> + * 	Return
> + * 		0 on success, or a negative error in case of failure.
>   */
>  #define __BPF_FUNC_MAPPER(FN)		\
>  	FN(unspec),			\
> -- 
> 2.14.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC bpf-next v2 7/8] bpf: add documentation for eBPF helpers (51-57)
From: Andrey Ignatov @ 2018-04-10 17:50 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: daniel, ast, netdev, oss-drivers, linux-doc, linux-man,
	Lawrence Brakmo, Yonghong Song, Josef Bacik
In-Reply-To: <20180410144157.4831-8-quentin.monnet@netronome.com>

Quentin Monnet <quentin.monnet@netronome.com> [Tue, 2018-04-10 07:43 -0700]:
> + * int bpf_bind(struct bpf_sock_addr_kern *ctx, struct sockaddr *addr, int addr_len)
> + * 	Description
> + * 		Bind the socket associated to *ctx* to the address pointed by
> + * 		*addr*, of length *addr_len*. This allows for making outgoing
> + * 		connection from the desired IP address, which can be useful for
> + * 		example when all processes inside a cgroup should use one
> + * 		single IP address on a host that has multiple IP configured.
> + *
> + * 		This helper works for IPv4 and IPv6, TCP and UDP sockets. The
> + * 		domain (*addr*\ **->sa_family**) must be **AF_INET** (or
> + * 		**AF_INET6**). Looking for a free port to bind to can be
> + * 		expensive, therefore binding to port is not permitted by the
> + * 		helper: *addr*\ **->sin_port** (or **sin6_port**, respectively)
> + * 		must be set to zero.
> + *
> + * 		As for the remote end, both parts of it can be overridden,
> + * 		remote IP and remote port. This can be useful if an application
> + * 		inside a cgroup wants to connect to another application inside
> + * 		the same cgroup or to itself, but knows nothing about the IP
> + * 		address assigned to the cgroup.

The last paragraph ("As for the remote end ...") is not relevant to
bpf_bind() and should be removed. It's about sys_connect hook itself
that can call to bpf_bind() but also has other functionality (and that
other functionality is described by this paragraph).


-- 
Andrey Ignatov
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [RFC 01/10] PCI: dwc: Add MSI-X callbacks handler
From: Gustavo Pimentel @ 2018-04-10 17:14 UTC (permalink / raw)
  To: bhelgaas, lorenzo.pieralisi, Joao.Pinto, jingoohan1, kishon,
	adouglas, niklas.cassel, jesper.nilsson
  Cc: linux-pci, linux-doc, linux-kernel, gustavo.pimentel
In-Reply-To: <cover.1523379766.git.gustavo.pimentel@synopsys.com>

Changes the pcie_raise_irq function signature, namely the interrupt_num
variable type from u8 to u16 to accommodate the MSI-X maximum interrupts
of 2048.

Implements a PCIe config space capability iterator function to search and
save the MSI and MSI-X pointers. With this method the code becomes more
generic and flexible.

Implements MSI-X set/get functions for sysfs interface in order to change
the EP entries number.

Implements EP MSI-X interface for triggering interruptions.

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
---
 drivers/pci/dwc/pci-dra7xx.c           |   2 +-
 drivers/pci/dwc/pcie-artpec6.c         |   2 +-
 drivers/pci/dwc/pcie-designware-ep.c   | 145 ++++++++++++++++++++++++++++++++-
 drivers/pci/dwc/pcie-designware-plat.c |   6 +-
 drivers/pci/dwc/pcie-designware.h      |  23 +++++-
 5 files changed, 173 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index ed8558d..5265725 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -369,7 +369,7 @@ static void dra7xx_pcie_raise_msi_irq(struct dra7xx_pcie *dra7xx,
 }
 
 static int dra7xx_pcie_raise_irq(struct dw_pcie_ep *ep, u8 func_no,
-				 enum pci_epc_irq_type type, u8 interrupt_num)
+				 enum pci_epc_irq_type type, u16 interrupt_num)
 {
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 	struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pci);
diff --git a/drivers/pci/dwc/pcie-artpec6.c b/drivers/pci/dwc/pcie-artpec6.c
index e66cede..96dc259 100644
--- a/drivers/pci/dwc/pcie-artpec6.c
+++ b/drivers/pci/dwc/pcie-artpec6.c
@@ -428,7 +428,7 @@ static void artpec6_pcie_ep_init(struct dw_pcie_ep *ep)
 }
 
 static int artpec6_pcie_raise_irq(struct dw_pcie_ep *ep, u8 func_no,
-				  enum pci_epc_irq_type type, u8 interrupt_num)
+				  enum pci_epc_irq_type type, u16 interrupt_num)
 {
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 
diff --git a/drivers/pci/dwc/pcie-designware-ep.c b/drivers/pci/dwc/pcie-designware-ep.c
index 15b22a6..874d4c2 100644
--- a/drivers/pci/dwc/pcie-designware-ep.c
+++ b/drivers/pci/dwc/pcie-designware-ep.c
@@ -40,6 +40,44 @@ void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar)
 	__dw_pcie_ep_reset_bar(pci, bar, 0);
 }
 
+void dw_pcie_ep_find_cap_addr(struct dw_pcie_ep *ep)
+{
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	u8 next_ptr, curr_ptr, cap_id;
+	u16 reg;
+
+	memset(&ep->cap_addr, 0, sizeof(ep->cap_addr));
+
+	reg = dw_pcie_readw_dbi(pci, PCI_STATUS);
+	if (!(reg & PCI_STATUS_CAP_LIST))
+		return;
+
+	reg = dw_pcie_readw_dbi(pci, PCI_CAPABILITY_LIST);
+	next_ptr = (reg & 0x00ff);
+	if (!next_ptr)
+		return;
+
+	reg = dw_pcie_readw_dbi(pci, next_ptr);
+	curr_ptr = next_ptr;
+	next_ptr = (reg & 0xff00) >> 8;
+	cap_id = (reg & 0x00ff);
+
+	while (next_ptr && (cap_id <= PCI_CAP_ID_MAX)) {
+		switch (cap_id) {
+		case PCI_CAP_ID_MSI:
+			ep->cap_addr.msi_addr = curr_ptr;
+			break;
+		case PCI_CAP_ID_MSIX:
+			ep->cap_addr.msix_addr = curr_ptr;
+			break;
+		}
+		reg = dw_pcie_readw_dbi(pci, next_ptr);
+		curr_ptr = next_ptr;
+		next_ptr = (reg & 0xff00) >> 8;
+		cap_id = (reg & 0x00ff);
+	}
+}
+
 static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
 				   struct pci_epf_header *hdr)
 {
@@ -241,8 +279,47 @@ static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 func_no, u8 encode_int)
 	return 0;
 }
 
+static int dw_pcie_ep_get_msix(struct pci_epc *epc, u8 func_no)
+{
+	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	u32 val, reg;
+
+	if (ep->cap_addr.msix_addr == 0)
+		return 0;
+
+	reg = ep->cap_addr.msix_addr + PCI_MSIX_FLAGS;
+	val = dw_pcie_readw_dbi(pci, reg);
+	if (!(val & PCI_MSIX_FLAGS_ENABLE))
+		return -EINVAL;
+
+	val &= PCI_MSIX_FLAGS_QSIZE;
+
+	return val;
+}
+
+static int dw_pcie_ep_set_msix(struct pci_epc *epc, u8 func_no, u16 interrupts)
+{
+	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	u32 val, reg;
+
+	if (ep->cap_addr.msix_addr == 0)
+		return 0;
+
+	reg = ep->cap_addr.msix_addr + PCI_MSIX_FLAGS;
+	val = dw_pcie_readw_dbi(pci, reg);
+	val &= ~PCI_MSIX_FLAGS_QSIZE;
+	val |= interrupts;
+	dw_pcie_dbi_ro_wr_en(pci);
+	dw_pcie_writew_dbi(pci, reg, val);
+	dw_pcie_dbi_ro_wr_dis(pci);
+
+	return 0;
+}
+
 static int dw_pcie_ep_raise_irq(struct pci_epc *epc, u8 func_no,
-				enum pci_epc_irq_type type, u8 interrupt_num)
+				enum pci_epc_irq_type type, u16 interrupt_num)
 {
 	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
 
@@ -282,6 +359,8 @@ static const struct pci_epc_ops epc_ops = {
 	.unmap_addr		= dw_pcie_ep_unmap_addr,
 	.set_msi		= dw_pcie_ep_set_msi,
 	.get_msi		= dw_pcie_ep_get_msi,
+	.set_msix		= dw_pcie_ep_set_msix,
+	.get_msix		= dw_pcie_ep_get_msix,
 	.raise_irq		= dw_pcie_ep_raise_irq,
 	.start			= dw_pcie_ep_start,
 	.stop			= dw_pcie_ep_stop,
@@ -322,6 +401,60 @@ int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 	return 0;
 }
 
+int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
+			     u16 interrupt_num)
+{
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	struct pci_epc *epc = ep->epc;
+	u16 tbl_offset, bir;
+	u32 bar_addr_upper, bar_addr_lower;
+	u32 msg_addr_upper, msg_addr_lower;
+	u32 reg, msg_data;
+	u64 tbl_addr, msg_addr, reg_u64;
+	void __iomem *msix_tbl;
+	int ret;
+
+	reg = ep->cap_addr.msix_addr + PCI_MSIX_TABLE;
+	tbl_offset = dw_pcie_readl_dbi(pci, reg);
+	bir = (tbl_offset & PCI_MSIX_TABLE_BIR);
+	tbl_offset &= PCI_MSIX_TABLE_OFFSET;
+	tbl_offset >>= 3;
+
+	reg = PCI_BASE_ADDRESS_0 + (4 * bir);
+	bar_addr_lower = dw_pcie_readl_dbi(pci, reg);
+	reg_u64 = (bar_addr_lower & PCI_BASE_ADDRESS_MEM_TYPE_MASK);
+	if (reg_u64 == PCI_BASE_ADDRESS_MEM_TYPE_64)
+		bar_addr_upper = dw_pcie_readl_dbi(pci, reg + 4);
+	else
+		bar_addr_upper = 0;
+
+	tbl_addr = ((u64) bar_addr_upper) << 32 | bar_addr_lower;
+	tbl_addr += (tbl_offset + ((interrupt_num - 1) * PCI_MSIX_ENTRY_SIZE));
+	tbl_addr &= PCI_BASE_ADDRESS_MEM_MASK;
+
+	msix_tbl = ioremap_nocache(ep->phys_base + tbl_addr, ep->addr_size);
+	if (!msix_tbl)
+		return -EINVAL;
+
+	msg_addr_lower = readl(msix_tbl + PCI_MSIX_ENTRY_LOWER_ADDR);
+	msg_addr_upper = readl(msix_tbl + PCI_MSIX_ENTRY_UPPER_ADDR);
+	msg_addr = ((u64) msg_addr_upper) << 32 | msg_addr_lower;
+	msg_data = readl(msix_tbl + PCI_MSIX_ENTRY_DATA);
+
+	iounmap(msix_tbl);
+
+	ret = dw_pcie_ep_map_addr(epc, func_no, ep->msix_mem_phys, msg_addr,
+				  epc->mem->page_size);
+	if (ret)
+		return ret;
+
+	writel(msg_data, ep->msix_mem);
+
+	dw_pcie_ep_unmap_addr(epc, func_no, ep->msix_mem_phys);
+
+	return 0;
+}
+
 void dw_pcie_ep_exit(struct dw_pcie_ep *ep)
 {
 	struct pci_epc *epc = ep->epc;
@@ -329,6 +462,9 @@ void dw_pcie_ep_exit(struct dw_pcie_ep *ep)
 	pci_epc_mem_free_addr(epc, ep->msi_mem_phys, ep->msi_mem,
 			      epc->mem->page_size);
 
+	pci_epc_mem_free_addr(epc, ep->msix_mem_phys, ep->msix_mem,
+			      epc->mem->page_size);
+
 	pci_epc_mem_exit(epc);
 }
 
@@ -411,6 +547,13 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
 		return -ENOMEM;
 	}
 
+	ep->msix_mem = pci_epc_mem_alloc_addr(epc, &ep->msix_mem_phys,
+					     epc->mem->page_size);
+	if (!ep->msix_mem) {
+		dev_err(dev, "Failed to reserve memory for MSI-\n");
+		return -ENOMEM;
+	}
+
 	ep->epc = epc;
 	epc_set_drvdata(epc, ep);
 	dw_pcie_setup(pci);
diff --git a/drivers/pci/dwc/pcie-designware-plat.c b/drivers/pci/dwc/pcie-designware-plat.c
index 2bad68d..c3a4707 100644
--- a/drivers/pci/dwc/pcie-designware-plat.c
+++ b/drivers/pci/dwc/pcie-designware-plat.c
@@ -74,11 +74,13 @@ static void dw_plat_pcie_ep_init(struct dw_pcie_ep *ep)
 
 	for (bar = BAR_0; bar <= BAR_5; bar++)
 		dw_pcie_ep_reset_bar(pci, bar);
+
+	dw_pcie_ep_find_cap_addr(ep);
 }
 
 static int dw_plat_pcie_ep_raise_irq(struct dw_pcie_ep *ep, u8 func_no,
 				     enum pci_epc_irq_type type,
-				     u8 interrupt_num)
+				     u16 interrupt_num)
 {
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 
@@ -88,6 +90,8 @@ static int dw_plat_pcie_ep_raise_irq(struct dw_pcie_ep *ep, u8 func_no,
 		return -EINVAL;
 	case PCI_EPC_IRQ_MSI:
 		return dw_pcie_ep_raise_msi_irq(ep, func_no, interrupt_num);
+	case PCI_EPC_IRQ_MSIX:
+		return dw_pcie_ep_raise_msix_irq(ep, func_no, interrupt_num);
 	default:
 		dev_err(pci->dev, "UNKNOWN IRQ type\n");
 	}
diff --git a/drivers/pci/dwc/pcie-designware.h b/drivers/pci/dwc/pcie-designware.h
index bee4e25..456fd94 100644
--- a/drivers/pci/dwc/pcie-designware.h
+++ b/drivers/pci/dwc/pcie-designware.h
@@ -191,7 +191,12 @@ enum dw_pcie_as_type {
 struct dw_pcie_ep_ops {
 	void	(*ep_init)(struct dw_pcie_ep *ep);
 	int	(*raise_irq)(struct dw_pcie_ep *ep, u8 func_no,
-			     enum pci_epc_irq_type type, u8 interrupt_num);
+			     enum pci_epc_irq_type type, u16 interrupt_num);
+};
+
+struct dw_pcie_cap_addr {
+	u8	msi_addr;
+	u8	msix_addr;
 };
 
 struct dw_pcie_ep {
@@ -208,6 +213,9 @@ struct dw_pcie_ep {
 	u32			num_ob_windows;
 	void __iomem		*msi_mem;
 	phys_addr_t		msi_mem_phys;
+	void __iomem		*msix_mem;
+	phys_addr_t		msix_mem_phys;
+	struct dw_pcie_cap_addr	cap_addr;
 };
 
 struct dw_pcie_ops {
@@ -359,7 +367,10 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep);
 void dw_pcie_ep_exit(struct dw_pcie_ep *ep);
 int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 			     u8 interrupt_num);
+int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
+			     u16 interrupt_num);
 void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar);
+void dw_pcie_ep_find_cap_addr(struct dw_pcie_ep *ep);
 #else
 static inline void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
 {
@@ -380,8 +391,18 @@ static inline int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 	return 0;
 }
 
+static inline int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
+					   u16 interrupt_num)
+{
+	return 0;
+}
+
 static inline void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar)
 {
 }
+
+static inline void dw_pcie_ep_find_cap_addr(struct dw_pcie_ep *ep)
+{
+}
 #endif
 #endif /* _PCIE_DESIGNWARE_H */
-- 
2.7.4


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [RFC 05/10] PCI: dwc: Add legacy interrupt callback handler
From: Gustavo Pimentel @ 2018-04-10 17:14 UTC (permalink / raw)
  To: bhelgaas, lorenzo.pieralisi, Joao.Pinto, jingoohan1, kishon,
	adouglas, niklas.cassel, jesper.nilsson
  Cc: linux-pci, linux-doc, linux-kernel, gustavo.pimentel
In-Reply-To: <cover.1523379766.git.gustavo.pimentel@synopsys.com>

Adds a legacy interrupt callback handler. Currently Designware IP doesn't
allow triggering the legacy interrupt.

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
---
 drivers/pci/dwc/pcie-designware-ep.c   | 10 ++++++++++
 drivers/pci/dwc/pcie-designware-plat.c |  3 +--
 drivers/pci/dwc/pcie-designware.h      |  6 ++++++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/dwc/pcie-designware-ep.c b/drivers/pci/dwc/pcie-designware-ep.c
index e352786..fb55259 100644
--- a/drivers/pci/dwc/pcie-designware-ep.c
+++ b/drivers/pci/dwc/pcie-designware-ep.c
@@ -375,6 +375,16 @@ static const struct pci_epc_ops epc_ops = {
 	.stop			= dw_pcie_ep_stop,
 };
 
+int dw_pcie_ep_raise_legacy_irq(struct dw_pcie_ep *ep, u8 func_no)
+{
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	struct device *dev = pci->dev;
+
+	dev_err(dev, "EP cannot trigger legacy IRQs\n");
+
+	return -EINVAL;
+}
+
 int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 			     u8 interrupt_num)
 {
diff --git a/drivers/pci/dwc/pcie-designware-plat.c b/drivers/pci/dwc/pcie-designware-plat.c
index c3a4707..3874b02 100644
--- a/drivers/pci/dwc/pcie-designware-plat.c
+++ b/drivers/pci/dwc/pcie-designware-plat.c
@@ -86,8 +86,7 @@ static int dw_plat_pcie_ep_raise_irq(struct dw_pcie_ep *ep, u8 func_no,
 
 	switch (type) {
 	case PCI_EPC_IRQ_LEGACY:
-		dev_err(pci->dev, "EP cannot trigger legacy IRQs\n");
-		return -EINVAL;
+		return dw_pcie_ep_raise_legacy_irq(ep, func_no);
 	case PCI_EPC_IRQ_MSI:
 		return dw_pcie_ep_raise_msi_irq(ep, func_no, interrupt_num);
 	case PCI_EPC_IRQ_MSIX:
diff --git a/drivers/pci/dwc/pcie-designware.h b/drivers/pci/dwc/pcie-designware.h
index 2acf18b0..808b280 100644
--- a/drivers/pci/dwc/pcie-designware.h
+++ b/drivers/pci/dwc/pcie-designware.h
@@ -354,6 +354,7 @@ static inline int dw_pcie_allocate_domains(struct pcie_port *pp)
 void dw_pcie_ep_linkup(struct dw_pcie_ep *ep);
 int dw_pcie_ep_init(struct dw_pcie_ep *ep);
 void dw_pcie_ep_exit(struct dw_pcie_ep *ep);
+int dw_pcie_ep_raise_legacy_irq(struct dw_pcie_ep *ep, u8 func_no);
 int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 			     u8 interrupt_num);
 int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
@@ -374,6 +375,11 @@ static inline void dw_pcie_ep_exit(struct dw_pcie_ep *ep)
 {
 }
 
+static inline int dw_pcie_ep_raise_legacy_irq(struct dw_pcie_ep *ep, u8 func_no)
+{
+	return 0;
+}
+
 static inline int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
 					   u8 interrupt_num)
 {
-- 
2.7.4


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [RFC 03/10] PCI: endpoint: Add MSI-X interfaces
From: Gustavo Pimentel @ 2018-04-10 17:14 UTC (permalink / raw)
  To: bhelgaas, lorenzo.pieralisi, Joao.Pinto, jingoohan1, kishon,
	adouglas, niklas.cassel, jesper.nilsson
  Cc: linux-pci, linux-doc, linux-kernel, gustavo.pimentel
In-Reply-To: <cover.1523379766.git.gustavo.pimentel@synopsys.com>

Implements the generic method for calling the get/set callbacks.

Adds the PCI_EPC_IRQ_MSIX type.

Adds the MSI-X callbacks signatures to the ops structure.

Adds sysfs interface for altering the number of MSI-X entries.

Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
---
 drivers/pci/endpoint/pci-ep-cfs.c   | 24 ++++++++++++++++
 drivers/pci/endpoint/pci-epc-core.c | 57 +++++++++++++++++++++++++++++++++++++
 include/linux/pci-epc.h             | 11 ++++++-
 include/linux/pci-epf.h             |  1 +
 4 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/endpoint/pci-ep-cfs.c b/drivers/pci/endpoint/pci-ep-cfs.c
index 018ea34..d1288a0 100644
--- a/drivers/pci/endpoint/pci-ep-cfs.c
+++ b/drivers/pci/endpoint/pci-ep-cfs.c
@@ -286,6 +286,28 @@ static ssize_t pci_epf_msi_interrupts_show(struct config_item *item,
 		       to_pci_epf_group(item)->epf->msi_interrupts);
 }
 
+static ssize_t pci_epf_msix_interrupts_store(struct config_item *item,
+					     const char *page, size_t len)
+{
+	u16 val;
+	int ret;
+
+	ret = kstrtou16(page, 0, &val);
+	if (ret)
+		return ret;
+
+	to_pci_epf_group(item)->epf->msix_interrupts = val;
+
+	return len;
+}
+
+static ssize_t pci_epf_msix_interrupts_show(struct config_item *item,
+					    char *page)
+{
+	return sprintf(page, "%d\n",
+		       to_pci_epf_group(item)->epf->msix_interrupts);
+}
+
 PCI_EPF_HEADER_R(vendorid)
 PCI_EPF_HEADER_W_u16(vendorid)
 
@@ -327,6 +349,7 @@ CONFIGFS_ATTR(pci_epf_, subsys_vendor_id);
 CONFIGFS_ATTR(pci_epf_, subsys_id);
 CONFIGFS_ATTR(pci_epf_, interrupt_pin);
 CONFIGFS_ATTR(pci_epf_, msi_interrupts);
+CONFIGFS_ATTR(pci_epf_, msix_interrupts);
 
 static struct configfs_attribute *pci_epf_attrs[] = {
 	&pci_epf_attr_vendorid,
@@ -340,6 +363,7 @@ static struct configfs_attribute *pci_epf_attrs[] = {
 	&pci_epf_attr_subsys_id,
 	&pci_epf_attr_interrupt_pin,
 	&pci_epf_attr_msi_interrupts,
+	&pci_epf_attr_msix_interrupts,
 	NULL,
 };
 
diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index b0ee427..294a383 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -218,6 +218,63 @@ int pci_epc_set_msi(struct pci_epc *epc, u8 func_no, u8 interrupts)
 EXPORT_SYMBOL_GPL(pci_epc_set_msi);
 
 /**
+ * pci_epc_get_msix() - get the number of MSI-X interrupt numbers allocated
+ * @epc: the EPC device to which MSI-X interrupts was requested
+ * @func_no: the endpoint function number in the EPC device
+ *
+ * Invoke to get the number of MSI-X interrupts allocated by the RC
+ */
+int pci_epc_get_msix(struct pci_epc *epc, u8 func_no)
+{
+	int interrupt;
+	unsigned long flags;
+
+	if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions)
+		return 0;
+
+	if (!epc->ops->get_msix)
+		return 0;
+
+	spin_lock_irqsave(&epc->lock, flags);
+	interrupt = epc->ops->get_msix(epc, func_no);
+	spin_unlock_irqrestore(&epc->lock, flags);
+
+	if (interrupt < 0)
+		return 0;
+
+	return interrupt++;
+}
+EXPORT_SYMBOL_GPL(pci_epc_get_msix);
+
+/**
+ * pci_epc_set_msix() - set the number of MSI-X interrupt numbers required
+ * @epc: the EPC device on which MSI-X has to be configured
+ * @func_no: the endpoint function number in the EPC device
+ * @interrupts: number of MSI-X interrupts required by the EPF
+ *
+ * Invoke to set the required number of MSI-X interrupts.
+ */
+int pci_epc_set_msix(struct pci_epc *epc, u8 func_no, u16 interrupts)
+{
+	int ret;
+	unsigned long flags;
+
+	if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions ||
+	    interrupts < 1 || interrupts > 2048)
+		return -EINVAL;
+
+	if (!epc->ops->set_msix)
+		return 0;
+
+	spin_lock_irqsave(&epc->lock, flags);
+	ret = epc->ops->set_msix(epc, func_no, interrupts - 1);
+	spin_unlock_irqrestore(&epc->lock, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pci_epc_set_msix);
+
+/**
  * pci_epc_unmap_addr() - unmap CPU address from PCI address
  * @epc: the EPC device on which address is allocated
  * @func_no: the endpoint function number in the EPC device
diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
index af657ca..32e8961 100644
--- a/include/linux/pci-epc.h
+++ b/include/linux/pci-epc.h
@@ -17,6 +17,7 @@ enum pci_epc_irq_type {
 	PCI_EPC_IRQ_UNKNOWN,
 	PCI_EPC_IRQ_LEGACY,
 	PCI_EPC_IRQ_MSI,
+	PCI_EPC_IRQ_MSIX,
 };
 
 /**
@@ -30,6 +31,10 @@ enum pci_epc_irq_type {
  *	     capability register
  * @get_msi: ops to get the number of MSI interrupts allocated by the RC from
  *	     the MSI capability register
+ * @set_msix: ops to set the requested number of MSI-X interrupts in the
+ *	     MSI-X capability register
+ * @get_msix: ops to get the number of MSI-X interrupts allocated by the RC
+ *	     from the MSI-X capability register
  * @raise_irq: ops to raise a legacy or MSI interrupt
  * @start: ops to start the PCI link
  * @stop: ops to stop the PCI link
@@ -48,8 +53,10 @@ struct pci_epc_ops {
 			      phys_addr_t addr);
 	int	(*set_msi)(struct pci_epc *epc, u8 func_no, u8 interrupts);
 	int	(*get_msi)(struct pci_epc *epc, u8 func_no);
+	int	(*set_msix)(struct pci_epc *epc, u8 func_no, u16 interrupts);
+	int	(*get_msix)(struct pci_epc *epc, u8 func_no);
 	int	(*raise_irq)(struct pci_epc *epc, u8 func_no,
-			     enum pci_epc_irq_type type, u8 interrupt_num);
+			     enum pci_epc_irq_type type, u16 interrupt_num);
 	int	(*start)(struct pci_epc *epc);
 	void	(*stop)(struct pci_epc *epc);
 	struct module *owner;
@@ -136,6 +143,8 @@ void pci_epc_unmap_addr(struct pci_epc *epc, u8 func_no,
 			phys_addr_t phys_addr);
 int pci_epc_set_msi(struct pci_epc *epc, u8 func_no, u8 interrupts);
 int pci_epc_get_msi(struct pci_epc *epc, u8 func_no);
+int pci_epc_set_msix(struct pci_epc *epc, u8 func_no, u16 interrupts);
+int pci_epc_get_msix(struct pci_epc *epc, u8 func_no);
 int pci_epc_raise_irq(struct pci_epc *epc, u8 func_no,
 		      enum pci_epc_irq_type type, u8 interrupt_num);
 int pci_epc_start(struct pci_epc *epc);
diff --git a/include/linux/pci-epf.h b/include/linux/pci-epf.h
index f7d6f48..9bb1f31 100644
--- a/include/linux/pci-epf.h
+++ b/include/linux/pci-epf.h
@@ -119,6 +119,7 @@ struct pci_epf {
 	struct pci_epf_header	*header;
 	struct pci_epf_bar	bar[6];
 	u8			msi_interrupts;
+	u16			msix_interrupts;
 	u8			func_no;
 
 	struct pci_epc		*epc;
-- 
2.7.4


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox