[PATCH v6 0/5] Add HWMON support for DGFX

linux-hwmon.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v6 0/5] Add HWMON support for DGFX
@ 2023-09-25  8:18 Badal Nilawar
  2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Badal Nilawar @ 2023-09-25  8:18 UTC (permalink / raw)
  To: intel-xe, linux-hwmon
  Cc: anshuman.gupta, ashutosh.dixit, linux, andi.shyti, riana.tauro,
	matthew.brost, rodrigo.vivi

This series adds the hwmon support on xe driver for 
DGFX. This is ported from i915 hwmon. 

v3: Fix review comments (Matt Brost/Andi) 

v4:
  - Squashed "Add HWMON infrastructure" patch to "Expose power attributes"
  - Dropped changes related to disable PL1 to boost firmware loading. 
    Will handle it saperate patch/series.
  - Dropped changes related to gt specific energy attributes. 
    Will handle gt specific energy attributes in saperate patch/series with design
    change suggested by Guenter
  - Fix review comments (Andi/Guenter)

v5: Rebased and clean up

v6: 
  - Dropped XE_MISSING_CASE patch
  - Fix couple of review comments 

Badal Nilawar (5):
  drm/xe/hwmon: Expose power attributes
  drm/xe/hwmon: Expose card reactive critical power
  drm/xe/hwmon: Expose input voltage attribute
  drm/xe/hwmon: Expose hwmon energy attribute
  drm/xe/hwmon: Expose power1_max_interval

 .../ABI/testing/sysfs-driver-intel-xe-hwmon   |  72 ++
 drivers/gpu/drm/xe/Makefile                   |   3 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h          |   9 +
 drivers/gpu/drm/xe/regs/xe_mchbar_regs.h      |  44 +
 drivers/gpu/drm/xe/xe_device.c                |   3 +
 drivers/gpu/drm/xe/xe_device_types.h          |   2 +
 drivers/gpu/drm/xe/xe_hwmon.c                 | 757 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_hwmon.h                 |  19 +
 drivers/gpu/drm/xe/xe_pcode.h                 |   5 +
 drivers/gpu/drm/xe/xe_pcode_api.h             |   7 +
 10 files changed, 921 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
 create mode 100644 drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
 create mode 100644 drivers/gpu/drm/xe/xe_hwmon.c
 create mode 100644 drivers/gpu/drm/xe/xe_hwmon.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-25  8:18 [PATCH v6 0/5] Add HWMON support for DGFX Badal Nilawar
@ 2023-09-25  8:18 ` Badal Nilawar
  2023-09-25  8:58   ` Andi Shyti
                     ` (3 more replies)
  2023-09-25  8:18 ` [PATCH v6 2/5] drm/xe/hwmon: Expose card reactive critical power Badal Nilawar
                   ` (3 subsequent siblings)
  4 siblings, 4 replies; 32+ messages in thread
From: Badal Nilawar @ 2023-09-25  8:18 UTC (permalink / raw)
  To: intel-xe, linux-hwmon
  Cc: anshuman.gupta, ashutosh.dixit, linux, andi.shyti, riana.tauro,
	matthew.brost, rodrigo.vivi

Expose Card reactive sustained (pl1) power limit as power_max and
card default power limit (tdp) as power_rated_max.

v2:
  - Fix review comments (Riana)
v3:
  - Use drmm_mutex_init (Matt Brost)
  - Print error value (Matt Brost)
  - Convert enums to uppercase (Matt Brost)
  - Avoid extra reg read in hwmon_is_visible function (Riana)
  - Use xe_device_assert_mem_access when applicable (Matt Brost)
  - Add intel-xe@lists.freedesktop.org in Documentation (Matt Brost)
v4:
  - Use prefix xe_hwmon prefix for all functions (Matt Brost/Andi)
  - %s/hwmon_reg/xe_hwmon_reg (Andi)
  - Fix review comments (Guenter/Andi)
v5:
  - Fix review comments (Riana)
v6:
  - Use drm_warn in default case (Rodrigo)
  - s/ENODEV/EOPNOTSUPP (Andi)

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 .../ABI/testing/sysfs-driver-intel-xe-hwmon   |  22 ++
 drivers/gpu/drm/xe/Makefile                   |   3 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h          |   4 +
 drivers/gpu/drm/xe/regs/xe_mchbar_regs.h      |  33 ++
 drivers/gpu/drm/xe/xe_device.c                |   3 +
 drivers/gpu/drm/xe/xe_device_types.h          |   2 +
 drivers/gpu/drm/xe/xe_hwmon.c                 | 357 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_hwmon.h                 |  19 +
 8 files changed, 443 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
 create mode 100644 drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
 create mode 100644 drivers/gpu/drm/xe/xe_hwmon.c
 create mode 100644 drivers/gpu/drm/xe/xe_hwmon.h

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
new file mode 100644
index 000000000000..da0197a29fe4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
@@ -0,0 +1,22 @@
+What:		/sys/devices/.../hwmon/hwmon<i>/power1_max
+Date:		September 2023
+KernelVersion:	6.5
+Contact:	intel-xe@lists.freedesktop.org
+Description:	RW. Card reactive sustained  (PL1) power limit in microwatts.
+
+		The power controller will throttle the operating frequency
+		if the power averaged over a window (typically seconds)
+		exceeds this limit. A read value of 0 means that the PL1
+		power limit is disabled, writing 0 disables the
+		limit. Writing values > 0 and <= TDP will enable the power limit.
+
+		Only supported for particular Intel xe graphics platforms.
+
+What:		/sys/devices/.../hwmon/hwmon<i>/power1_rated_max
+Date:		September 2023
+KernelVersion:	6.5
+Contact:	intel-xe@lists.freedesktop.org
+Description:	RO. Card default power limit (default TDP setting).
+
+		Only supported for particular Intel xe graphics platforms.
+
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index b1681d1416eb..294fee78dd29 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -118,6 +118,9 @@ xe-y += xe_bb.o \
 	xe_wa.o \
 	xe_wopcm.o
 
+# graphics hardware monitoring (HWMON) support
+xe-$(CONFIG_HWMON) += xe_hwmon.o
+
 # i915 Display compat #defines and #includes
 subdir-ccflags-$(CONFIG_DRM_XE_DISPLAY) += \
 	-I$(srctree)/$(src)/display/ext \
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index e13fbbdf6929..679cdba9f383 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -410,4 +410,8 @@
 #define XEHPC_BCS5_BCS6_INTR_MASK		XE_REG(0x190118)
 #define XEHPC_BCS7_BCS8_INTR_MASK		XE_REG(0x19011c)
 
+#define PVC_GT0_PACKAGE_RAPL_LIMIT		XE_REG(0x281008)
+#define PVC_GT0_PACKAGE_POWER_SKU_UNIT		XE_REG(0x281068)
+#define PVC_GT0_PACKAGE_POWER_SKU		XE_REG(0x281080)
+
 #endif
diff --git a/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h b/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
new file mode 100644
index 000000000000..27f1d42baf6d
--- /dev/null
+++ b/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _XE_MCHBAR_REGS_H_
+#define _XE_MCHBAR_REGS_H_
+
+#include "regs/xe_reg_defs.h"
+
+/*
+ * MCHBAR mirror.
+ *
+ * This mirrors the MCHBAR MMIO space whose location is determined by
+ * device 0 function 0's pci config register 0x44 or 0x48 and matches it in
+ * every way.
+ */
+
+#define MCHBAR_MIRROR_BASE_SNB			0x140000
+
+#define PCU_CR_PACKAGE_POWER_SKU		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x5930)
+#define   PKG_TDP				GENMASK_ULL(14, 0)
+#define   PKG_MIN_PWR				GENMASK_ULL(30, 16)
+#define   PKG_MAX_PWR				GENMASK_ULL(46, 32)
+
+#define PCU_CR_PACKAGE_POWER_SKU_UNIT		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x5938)
+#define   PKG_PWR_UNIT				REG_GENMASK(3, 0)
+
+#define PCU_CR_PACKAGE_RAPL_LIMIT		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x59a0)
+#define   PKG_PWR_LIM_1				REG_GENMASK(14, 0)
+#define   PKG_PWR_LIM_1_EN			REG_BIT(15)
+
+#endif /* _XE_MCHBAR_REGS_H_ */
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 687dc3d79a66..b9ff42a26ca3 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -35,6 +35,7 @@
 #include "xe_vm.h"
 #include "xe_vm_madvise.h"
 #include "xe_wait_user_fence.h"
+#include "xe_hwmon.h"
 
 #ifdef CONFIG_LOCKDEP
 struct lockdep_map xe_device_mem_access_lockdep_map = {
@@ -356,6 +357,8 @@ int xe_device_probe(struct xe_device *xe)
 
 	xe_pmu_register(&xe->pmu);
 
+	xe_hwmon_register(xe);
+
 	err = drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
 	if (err)
 		return err;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 32ab0fea04ee..519aec01fb0b 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -365,6 +365,8 @@ struct xe_device {
 	/** @pmu: performance monitoring unit */
 	struct xe_pmu pmu;
 
+	struct xe_hwmon *hwmon;
+
 	/* private: */
 
 #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
new file mode 100644
index 000000000000..44d814e111c6
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -0,0 +1,357 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include <linux/hwmon.h>
+
+#include <drm/drm_managed.h>
+#include "regs/xe_gt_regs.h"
+#include "regs/xe_mchbar_regs.h"
+#include "xe_device.h"
+#include "xe_gt.h"
+#include "xe_hwmon.h"
+#include "xe_mmio.h"
+
+enum xe_hwmon_reg {
+	REG_PKG_RAPL_LIMIT,
+	REG_PKG_POWER_SKU,
+	REG_PKG_POWER_SKU_UNIT,
+};
+
+enum xe_hwmon_reg_operation {
+	REG_READ,
+	REG_WRITE,
+	REG_RMW,
+};
+
+/*
+ * SF_* - scale factors for particular quantities according to hwmon spec.
+ */
+#define SF_POWER	1000000		/* microwatts */
+
+struct xe_hwmon {
+	struct device *hwmon_dev;
+	struct xe_gt *gt;
+	struct mutex hwmon_lock; /* rmw operations*/
+	int scl_shift_power;
+};
+
+static u32 xe_hwmon_get_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg)
+{
+	struct xe_device *xe = gt_to_xe(hwmon->gt);
+	struct xe_reg reg = XE_REG(0);
+
+	switch (hwmon_reg) {
+	case REG_PKG_RAPL_LIMIT:
+		if (xe->info.platform == XE_DG2)
+			reg = PCU_CR_PACKAGE_RAPL_LIMIT;
+		else if (xe->info.platform == XE_PVC)
+			reg = PVC_GT0_PACKAGE_RAPL_LIMIT;
+		break;
+	case REG_PKG_POWER_SKU:
+		if (xe->info.platform == XE_DG2)
+			reg = PCU_CR_PACKAGE_POWER_SKU;
+		else if (xe->info.platform == XE_PVC)
+			reg = PVC_GT0_PACKAGE_POWER_SKU;
+		break;
+	case REG_PKG_POWER_SKU_UNIT:
+		if (xe->info.platform == XE_DG2)
+			reg = PCU_CR_PACKAGE_POWER_SKU_UNIT;
+		else if (xe->info.platform == XE_PVC)
+			reg = PVC_GT0_PACKAGE_POWER_SKU_UNIT;
+		break;
+	default:
+		drm_warn(&xe->drm, "Unknown xe hwmon reg id: %d\n", hwmon_reg);
+		break;
+	}
+
+	return reg.raw;
+}
+
+static int xe_hwmon_process_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg,
+				enum xe_hwmon_reg_operation operation, u32 *value,
+				u32 clr, u32 set)
+{
+	struct xe_reg reg;
+
+	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
+
+	if (!reg.raw)
+		return -EOPNOTSUPP;
+
+	switch (operation) {
+	case REG_READ:
+		*value = xe_mmio_read32(hwmon->gt, reg);
+		return 0;
+	case REG_WRITE:
+		xe_mmio_write32(hwmon->gt, reg, *value);
+		return 0;
+	case REG_RMW:
+		*value = xe_mmio_rmw32(hwmon->gt, reg, clr, set);
+		return 0;
+	default:
+		drm_warn(&gt_to_xe(hwmon->gt)->drm, "Invalid xe hwmon reg operation: %d\n",
+			 operation);
+		return -EOPNOTSUPP;
+	}
+}
+
+int xe_hwmon_process_reg_read64(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg, u64 *value)
+{
+	struct xe_reg reg;
+
+	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
+
+	if (!reg.raw)
+		return -EOPNOTSUPP;
+
+	*value = xe_mmio_read64_2x32(hwmon->gt, reg);
+
+	return 0;
+}
+
+#define PL1_DISABLE 0
+
+/*
+ * HW allows arbitrary PL1 limits to be set but silently clamps these values to
+ * "typical but not guaranteed" min/max values in REG_PKG_POWER_SKU. Follow the
+ * same pattern for sysfs, allow arbitrary PL1 limits to be set but display
+ * clamped values when read.
+ */
+static int xe_hwmon_power_max_read(struct xe_hwmon *hwmon, long *value)
+{
+	u32 reg_val;
+	u64 reg_val64, min, max;
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val, 0, 0);
+	/* Check if PL1 limit is disabled */
+	if (!(reg_val & PKG_PWR_LIM_1_EN)) {
+		*value = PL1_DISABLE;
+		return 0;
+	}
+
+	reg_val = REG_FIELD_GET(PKG_PWR_LIM_1, reg_val);
+	*value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power);
+
+	xe_hwmon_process_reg_read64(hwmon, REG_PKG_POWER_SKU, &reg_val64);
+	min = REG_FIELD_GET(PKG_MIN_PWR, reg_val64);
+	min = mul_u64_u32_shr(min, SF_POWER, hwmon->scl_shift_power);
+	max = REG_FIELD_GET(PKG_MAX_PWR, reg_val64);
+	max = mul_u64_u32_shr(max, SF_POWER, hwmon->scl_shift_power);
+
+	if (min && max)
+		*value = clamp_t(u64, *value, min, max);
+
+	return 0;
+}
+
+static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
+{
+	u32 reg_val;
+
+	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
+	if (value == PL1_DISABLE) {
+		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
+				     PKG_PWR_LIM_1_EN, 0);
+		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,
+				     PKG_PWR_LIM_1_EN, 0);
+
+		if (reg_val & PKG_PWR_LIM_1_EN)
+			return -EOPNOTSUPP;
+	}
+
+	/* Computation in 64-bits to avoid overflow. Round to nearest. */
+	reg_val = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
+	reg_val = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, reg_val);
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
+			     PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, reg_val);
+
+	return 0;
+}
+
+static int xe_hwmon_power_rated_max_read(struct xe_hwmon *hwmon, long *value)
+{
+	u32 reg_val;
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_POWER_SKU, REG_READ, &reg_val, 0, 0);
+	reg_val = REG_FIELD_GET(PKG_TDP, reg_val);
+	*value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power);
+
+	return 0;
+}
+
+static const struct hwmon_channel_info *hwmon_info[] = {
+	HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX),
+	NULL
+};
+
+static umode_t
+xe_hwmon_power_is_visible(struct xe_hwmon *hwmon, u32 attr, int chan)
+{
+	switch (attr) {
+	case hwmon_power_max:
+		return xe_hwmon_get_reg(hwmon, REG_PKG_RAPL_LIMIT) ? 0664 : 0;
+	case hwmon_power_rated_max:
+		return xe_hwmon_get_reg(hwmon, REG_PKG_POWER_SKU) ? 0444 : 0;
+	default:
+		return 0;
+	}
+}
+
+static int
+xe_hwmon_power_read(struct xe_hwmon *hwmon, u32 attr, int chan, long *val)
+{
+	switch (attr) {
+	case hwmon_power_max:
+		return xe_hwmon_power_max_read(hwmon, val);
+	case hwmon_power_rated_max:
+		return xe_hwmon_power_rated_max_read(hwmon, val);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+xe_hwmon_power_write(struct xe_hwmon *hwmon, u32 attr, int chan, long val)
+{
+	switch (attr) {
+	case hwmon_power_max:
+		return xe_hwmon_power_max_write(hwmon, val);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static umode_t
+xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
+		    u32 attr, int channel)
+{
+	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
+	int ret;
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	switch (type) {
+	case hwmon_power:
+		ret = xe_hwmon_power_is_visible(hwmon, attr, channel);
+		break;
+	default:
+		ret = 0;
+		break;
+	}
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+
+	return ret;
+}
+
+static int
+xe_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+	      int channel, long *val)
+{
+	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
+	int ret;
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	switch (type) {
+	case hwmon_power:
+		ret = xe_hwmon_power_read(hwmon, attr, channel, val);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+
+	return ret;
+}
+
+static int
+xe_hwmon_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+	       int channel, long val)
+{
+	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
+	int ret;
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	switch (type) {
+	case hwmon_power:
+		ret = xe_hwmon_power_write(hwmon, attr, channel, val);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+
+	return ret;
+}
+
+static const struct hwmon_ops hwmon_ops = {
+	.is_visible = xe_hwmon_is_visible,
+	.read = xe_hwmon_read,
+	.write = xe_hwmon_write,
+};
+
+static const struct hwmon_chip_info hwmon_chip_info = {
+	.ops = &hwmon_ops,
+	.info = hwmon_info,
+};
+
+static void
+xe_hwmon_get_preregistration_info(struct xe_device *xe)
+{
+	struct xe_hwmon *hwmon = xe->hwmon;
+	u32 val_sku_unit = 0;
+	int ret;
+
+	ret = xe_hwmon_process_reg(hwmon, REG_PKG_POWER_SKU_UNIT, REG_READ, &val_sku_unit, 0, 0);
+	/*
+	 * The contents of register PKG_POWER_SKU_UNIT do not change,
+	 * so read it once and store the shift values.
+	 */
+	if (!ret)
+		hwmon->scl_shift_power = REG_FIELD_GET(PKG_PWR_UNIT, val_sku_unit);
+}
+
+void xe_hwmon_register(struct xe_device *xe)
+{
+	struct device *dev = xe->drm.dev;
+	struct xe_hwmon *hwmon;
+
+	/* hwmon is available only for dGfx */
+	if (!IS_DGFX(xe))
+		return;
+
+	hwmon = devm_kzalloc(dev, sizeof(*hwmon), GFP_KERNEL);
+	if (!hwmon)
+		return;
+
+	xe->hwmon = hwmon;
+
+	drmm_mutex_init(&xe->drm, &hwmon->hwmon_lock);
+
+	/* primary GT to access device level properties */
+	hwmon->gt = xe->tiles[0].primary_gt;
+
+	xe_hwmon_get_preregistration_info(xe);
+
+	drm_dbg(&xe->drm, "Register xe hwmon interface\n");
+
+	/*  hwmon_dev points to device hwmon<i> */
+	hwmon->hwmon_dev = devm_hwmon_device_register_with_info(dev, "xe", hwmon,
+								&hwmon_chip_info,
+								NULL);
+	if (IS_ERR(hwmon->hwmon_dev)) {
+		drm_warn(&xe->drm, "Failed to register xe hwmon (%pe)\n", hwmon->hwmon_dev);
+		xe->hwmon = NULL;
+		return;
+	}
+}
+
diff --git a/drivers/gpu/drm/xe/xe_hwmon.h b/drivers/gpu/drm/xe/xe_hwmon.h
new file mode 100644
index 000000000000..c42a1de2cd7a
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_hwmon.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _XE_HWMON_H_
+#define _XE_HWMON_H_
+
+#include <linux/types.h>
+
+struct xe_device;
+
+#if IS_REACHABLE(CONFIG_HWMON)
+void xe_hwmon_register(struct xe_device *xe);
+#else
+static inline void xe_hwmon_register(struct xe_device *xe) { };
+#endif
+
+#endif /* _XE_HWMON_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v6 2/5] drm/xe/hwmon: Expose card reactive critical power
  2023-09-25  8:18 [PATCH v6 0/5] Add HWMON support for DGFX Badal Nilawar
  2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
@ 2023-09-25  8:18 ` Badal Nilawar
  2023-09-25  9:03   ` Andi Shyti
  2023-09-25  8:18 ` [PATCH v6 3/5] drm/xe/hwmon: Expose input voltage attribute Badal Nilawar
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 32+ messages in thread
From: Badal Nilawar @ 2023-09-25  8:18 UTC (permalink / raw)
  To: intel-xe, linux-hwmon
  Cc: anshuman.gupta, ashutosh.dixit, linux, andi.shyti, riana.tauro,
	matthew.brost, rodrigo.vivi

Expose the card reactive critical (I1) power. I1 is exposed as
power1_crit in microwatts (typically for client products) or as
curr1_crit in milliamperes (typically for server).

v2: Move PCODE_MBOX macro to pcode file (Riana)
v3: s/IS_DG2/(gt_to_xe(gt)->info.platform == XE_DG2)
v4: Fix review comments (Andi)

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 .../ABI/testing/sysfs-driver-intel-xe-hwmon   |  26 +++++
 drivers/gpu/drm/xe/xe_hwmon.c                 | 105 +++++++++++++++++-
 drivers/gpu/drm/xe/xe_pcode.h                 |   5 +
 drivers/gpu/drm/xe/xe_pcode_api.h             |   7 ++
 4 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
index da0197a29fe4..37263b09b6e4 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
@@ -20,3 +20,29 @@ Description:	RO. Card default power limit (default TDP setting).
 
 		Only supported for particular Intel xe graphics platforms.
 
+What:		/sys/devices/.../hwmon/hwmon<i>/power1_crit
+Date:		September 2023
+KernelVersion:	6.5
+Contact:	intel-xe@lists.freedesktop.org
+Description:	RW. Card reactive critical (I1) power limit in microwatts.
+
+		Card reactive critical (I1) power limit in microwatts is exposed
+		for client products. The power controller will throttle the
+		operating frequency if the power averaged over a window exceeds
+		this limit.
+
+		Only supported for particular Intel xe graphics platforms.
+
+What:		/sys/devices/.../hwmon/hwmon<i>/curr1_crit
+Date:		September 2023
+KernelVersion:	6.5
+Contact:	intel-xe@lists.freedesktop.org
+Description:	RW. Card reactive critical (I1) power limit in milliamperes.
+
+		Card reactive critical (I1) power limit in milliamperes is
+		exposed for server products. The power controller will throttle
+		the operating frequency if the power averaged over a window
+		exceeds this limit.
+
+		Only supported for particular Intel xe graphics platforms.
+
diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index 44d814e111c6..96e7a99738d5 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -12,6 +12,8 @@
 #include "xe_gt.h"
 #include "xe_hwmon.h"
 #include "xe_mmio.h"
+#include "xe_pcode.h"
+#include "xe_pcode_api.h"
 
 enum xe_hwmon_reg {
 	REG_PKG_RAPL_LIMIT,
@@ -29,6 +31,7 @@ enum xe_hwmon_reg_operation {
  * SF_* - scale factors for particular quantities according to hwmon spec.
  */
 #define SF_POWER	1000000		/* microwatts */
+#define SF_CURR		1000		/* milliamperes */
 
 struct xe_hwmon {
 	struct device *hwmon_dev;
@@ -183,18 +186,43 @@ static int xe_hwmon_power_rated_max_read(struct xe_hwmon *hwmon, long *value)
 }
 
 static const struct hwmon_channel_info *hwmon_info[] = {
-	HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX),
+	HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX | HWMON_P_CRIT),
+	HWMON_CHANNEL_INFO(curr, HWMON_C_CRIT),
 	NULL
 };
 
+/* I1 is exposed as power_crit or as curr_crit depending on bit 31 */
+static int xe_hwmon_pcode_read_i1(struct xe_gt *gt, u32 *uval)
+{
+	/* Avoid Illegal Subcommand error */
+	if (gt_to_xe(gt)->info.platform == XE_DG2)
+		return -ENXIO;
+
+	return xe_pcode_read(gt, PCODE_MBOX(PCODE_POWER_SETUP,
+			     POWER_SETUP_SUBCOMMAND_READ_I1, 0),
+			     uval, 0);
+}
+
+static int xe_hwmon_pcode_write_i1(struct xe_gt *gt, u32 uval)
+{
+	return xe_pcode_write(gt, PCODE_MBOX(PCODE_POWER_SETUP,
+			      POWER_SETUP_SUBCOMMAND_WRITE_I1, 0),
+			      uval);
+}
+
 static umode_t
 xe_hwmon_power_is_visible(struct xe_hwmon *hwmon, u32 attr, int chan)
 {
+	u32 uval;
+
 	switch (attr) {
 	case hwmon_power_max:
 		return xe_hwmon_get_reg(hwmon, REG_PKG_RAPL_LIMIT) ? 0664 : 0;
 	case hwmon_power_rated_max:
 		return xe_hwmon_get_reg(hwmon, REG_PKG_POWER_SKU) ? 0444 : 0;
+	case hwmon_power_crit:
+		return (xe_hwmon_pcode_read_i1(hwmon->gt, &uval) ||
+			!(uval & POWER_SETUP_I1_WATTS)) ? 0 : 0644;
 	default:
 		return 0;
 	}
@@ -203,11 +231,23 @@ xe_hwmon_power_is_visible(struct xe_hwmon *hwmon, u32 attr, int chan)
 static int
 xe_hwmon_power_read(struct xe_hwmon *hwmon, u32 attr, int chan, long *val)
 {
+	int ret;
+	u32 uval;
+
 	switch (attr) {
 	case hwmon_power_max:
 		return xe_hwmon_power_max_read(hwmon, val);
 	case hwmon_power_rated_max:
 		return xe_hwmon_power_rated_max_read(hwmon, val);
+	case hwmon_power_crit:
+		ret = xe_hwmon_pcode_read_i1(hwmon->gt, &uval);
+		if (ret)
+			return ret;
+		if (!(uval & POWER_SETUP_I1_WATTS))
+			return -ENODEV;
+		*val = mul_u64_u32_shr(REG_FIELD_GET(POWER_SETUP_I1_DATA_MASK, uval),
+				       SF_POWER, POWER_SETUP_I1_SHIFT);
+		return 0;
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -216,9 +256,63 @@ xe_hwmon_power_read(struct xe_hwmon *hwmon, u32 attr, int chan, long *val)
 static int
 xe_hwmon_power_write(struct xe_hwmon *hwmon, u32 attr, int chan, long val)
 {
+	u32 uval;
+
 	switch (attr) {
 	case hwmon_power_max:
 		return xe_hwmon_power_max_write(hwmon, val);
+	case hwmon_power_crit:
+		uval = DIV_ROUND_CLOSEST_ULL(val << POWER_SETUP_I1_SHIFT, SF_POWER);
+		return xe_hwmon_pcode_write_i1(hwmon->gt, uval);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static umode_t
+xe_hwmon_curr_is_visible(const struct xe_hwmon *hwmon, u32 attr)
+{
+	u32 uval;
+
+	switch (attr) {
+	case hwmon_curr_crit:
+		return (xe_hwmon_pcode_read_i1(hwmon->gt, &uval) ||
+			(uval & POWER_SETUP_I1_WATTS)) ? 0 : 0644;
+	default:
+		return 0;
+	}
+}
+
+static int
+xe_hwmon_curr_read(struct xe_hwmon *hwmon, u32 attr, long *val)
+{
+	int ret;
+	u32 uval;
+
+	switch (attr) {
+	case hwmon_curr_crit:
+		ret = xe_hwmon_pcode_read_i1(hwmon->gt, &uval);
+		if (ret)
+			return ret;
+		if (uval & POWER_SETUP_I1_WATTS)
+			return -ENODEV;
+		*val = mul_u64_u32_shr(REG_FIELD_GET(POWER_SETUP_I1_DATA_MASK, uval),
+				       SF_CURR, POWER_SETUP_I1_SHIFT);
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+xe_hwmon_curr_write(struct xe_hwmon *hwmon, u32 attr, long val)
+{
+	u32 uval;
+
+	switch (attr) {
+	case hwmon_curr_crit:
+		uval = DIV_ROUND_CLOSEST_ULL(val << POWER_SETUP_I1_SHIFT, SF_CURR);
+		return xe_hwmon_pcode_write_i1(hwmon->gt, uval);
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -237,6 +331,9 @@ xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
 	case hwmon_power:
 		ret = xe_hwmon_power_is_visible(hwmon, attr, channel);
 		break;
+	case hwmon_curr:
+		ret = xe_hwmon_curr_is_visible(hwmon, attr);
+		break;
 	default:
 		ret = 0;
 		break;
@@ -260,6 +357,9 @@ xe_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 	case hwmon_power:
 		ret = xe_hwmon_power_read(hwmon, attr, channel, val);
 		break;
+	case hwmon_curr:
+		ret = xe_hwmon_curr_read(hwmon, attr, val);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
@@ -283,6 +383,9 @@ xe_hwmon_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 	case hwmon_power:
 		ret = xe_hwmon_power_write(hwmon, attr, channel, val);
 		break;
+	case hwmon_curr:
+		ret = xe_hwmon_curr_write(hwmon, attr, val);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
diff --git a/drivers/gpu/drm/xe/xe_pcode.h b/drivers/gpu/drm/xe/xe_pcode.h
index 3b4aa8c1a3ba..08cb1d047cba 100644
--- a/drivers/gpu/drm/xe/xe_pcode.h
+++ b/drivers/gpu/drm/xe/xe_pcode.h
@@ -22,4 +22,9 @@ int xe_pcode_write_timeout(struct xe_gt *gt, u32 mbox, u32 val,
 int xe_pcode_request(struct xe_gt *gt, u32 mbox, u32 request,
 		     u32 reply_mask, u32 reply, int timeout_ms);
 
+#define PCODE_MBOX(mbcmd, param1, param2)\
+	(FIELD_PREP(PCODE_MB_COMMAND, mbcmd)\
+	| FIELD_PREP(PCODE_MB_PARAM1, param1)\
+	| FIELD_PREP(PCODE_MB_PARAM2, param2))
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_pcode_api.h b/drivers/gpu/drm/xe/xe_pcode_api.h
index 837ff7c71280..5935cfe30204 100644
--- a/drivers/gpu/drm/xe/xe_pcode_api.h
+++ b/drivers/gpu/drm/xe/xe_pcode_api.h
@@ -35,6 +35,13 @@
 #define     DGFX_GET_INIT_STATUS	0x0
 #define     DGFX_INIT_STATUS_COMPLETE	0x1
 
+#define   PCODE_POWER_SETUP			0x7C
+#define     POWER_SETUP_SUBCOMMAND_READ_I1	0x4
+#define     POWER_SETUP_SUBCOMMAND_WRITE_I1	0x5
+#define	    POWER_SETUP_I1_WATTS		REG_BIT(31)
+#define	    POWER_SETUP_I1_SHIFT		6	/* 10.6 fixed point format */
+#define	    POWER_SETUP_I1_DATA_MASK		REG_GENMASK(15, 0)
+
 struct pcode_err_decode {
 	int errno;
 	const char *str;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v6 3/5] drm/xe/hwmon: Expose input voltage attribute
  2023-09-25  8:18 [PATCH v6 0/5] Add HWMON support for DGFX Badal Nilawar
  2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
  2023-09-25  8:18 ` [PATCH v6 2/5] drm/xe/hwmon: Expose card reactive critical power Badal Nilawar
@ 2023-09-25  8:18 ` Badal Nilawar
  2023-09-25  9:04   ` Andi Shyti
  2023-09-25  8:18 ` [PATCH v6 4/5] drm/xe/hwmon: Expose hwmon energy attribute Badal Nilawar
  2023-09-25  8:18 ` [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval Badal Nilawar
  4 siblings, 1 reply; 32+ messages in thread
From: Badal Nilawar @ 2023-09-25  8:18 UTC (permalink / raw)
  To: intel-xe, linux-hwmon
  Cc: anshuman.gupta, ashutosh.dixit, linux, andi.shyti, riana.tauro,
	matthew.brost, rodrigo.vivi

Use Xe HWMON subsystem to display the input voltage.

v2:
  - Rename hwm_get_vltg to hwm_get_voltage (Riana)
  - Use scale factor SF_VOLTAGE (Riana)
v3:
  - %s/gt_perf_status/REG_GT_PERF_STATUS/
  - Remove platform check from hwmon_get_voltage()
v4:
  - Fix review comments (Andi)

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 .../ABI/testing/sysfs-driver-intel-xe-hwmon   |  6 ++
 drivers/gpu/drm/xe/regs/xe_gt_regs.h          |  3 +
 drivers/gpu/drm/xe/xe_hwmon.c                 | 58 +++++++++++++++++++
 3 files changed, 67 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
index 37263b09b6e4..7f9407c20864 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
@@ -44,5 +44,11 @@ Description:	RW. Card reactive critical (I1) power limit in milliamperes.
 		the operating frequency if the power averaged over a window
 		exceeds this limit.
 
+What:		/sys/devices/.../hwmon/hwmon<i>/in0_input
+Date:		September 2023
+KernelVersion:	6.5
+Contact:	intel-xe@lists.freedesktop.org
+Description:	RO. Current Voltage in millivolt.
+
 		Only supported for particular Intel xe graphics platforms.
 
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 679cdba9f383..102663cbc320 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -374,6 +374,9 @@
 #define GT_GFX_RC6_LOCKED			XE_REG(0x138104)
 #define GT_GFX_RC6				XE_REG(0x138108)
 
+#define GT_PERF_STATUS				XE_REG(0x1381b4)
+#define   VOLTAGE_MASK				REG_GENMASK(10, 0)
+
 #define GT_INTR_DW(x)				XE_REG(0x190018 + ((x) * 4))
 
 #define GUC_SG_INTR_ENABLE			XE_REG(0x190038)
diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index 96e7a99738d5..1079145c81c9 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -3,7 +3,9 @@
  * Copyright © 2023 Intel Corporation
  */
 
+#include <linux/hwmon-sysfs.h>
 #include <linux/hwmon.h>
+#include <linux/types.h>
 
 #include <drm/drm_managed.h>
 #include "regs/xe_gt_regs.h"
@@ -19,6 +21,7 @@ enum xe_hwmon_reg {
 	REG_PKG_RAPL_LIMIT,
 	REG_PKG_POWER_SKU,
 	REG_PKG_POWER_SKU_UNIT,
+	REG_GT_PERF_STATUS,
 };
 
 enum xe_hwmon_reg_operation {
@@ -32,6 +35,7 @@ enum xe_hwmon_reg_operation {
  */
 #define SF_POWER	1000000		/* microwatts */
 #define SF_CURR		1000		/* milliamperes */
+#define SF_VOLTAGE	1000		/* millivolts */
 
 struct xe_hwmon {
 	struct device *hwmon_dev;
@@ -64,6 +68,10 @@ static u32 xe_hwmon_get_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg)
 		else if (xe->info.platform == XE_PVC)
 			reg = PVC_GT0_PACKAGE_POWER_SKU_UNIT;
 		break;
+	case REG_GT_PERF_STATUS:
+		if (xe->info.platform == XE_DG2)
+			reg = GT_PERF_STATUS;
+		break;
 	default:
 		drm_warn(&xe->drm, "Unknown xe hwmon reg id: %d\n", hwmon_reg);
 		break;
@@ -188,6 +196,7 @@ static int xe_hwmon_power_rated_max_read(struct xe_hwmon *hwmon, long *value)
 static const struct hwmon_channel_info *hwmon_info[] = {
 	HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX | HWMON_P_CRIT),
 	HWMON_CHANNEL_INFO(curr, HWMON_C_CRIT),
+	HWMON_CHANNEL_INFO(in, HWMON_I_INPUT),
 	NULL
 };
 
@@ -210,6 +219,18 @@ static int xe_hwmon_pcode_write_i1(struct xe_gt *gt, u32 uval)
 			      uval);
 }
 
+static int xe_hwmon_get_voltage(struct xe_hwmon *hwmon, long *value)
+{
+	u32 reg_val;
+
+	xe_hwmon_process_reg(hwmon, REG_GT_PERF_STATUS,
+			     REG_READ, &reg_val, 0, 0);
+	/* HW register value in units of 2.5 millivolt */
+	*value = DIV_ROUND_CLOSEST(REG_FIELD_GET(VOLTAGE_MASK, reg_val) * 2500, SF_VOLTAGE);
+
+	return 0;
+}
+
 static umode_t
 xe_hwmon_power_is_visible(struct xe_hwmon *hwmon, u32 attr, int chan)
 {
@@ -318,6 +339,37 @@ xe_hwmon_curr_write(struct xe_hwmon *hwmon, u32 attr, long val)
 	}
 }
 
+static umode_t
+xe_hwmon_in_is_visible(struct xe_hwmon *hwmon, u32 attr)
+{
+	switch (attr) {
+	case hwmon_in_input:
+		return xe_hwmon_get_reg(hwmon, REG_GT_PERF_STATUS) ? 0444 : 0;
+	default:
+		return 0;
+	}
+}
+
+static int
+xe_hwmon_in_read(struct xe_hwmon *hwmon, u32 attr, long *val)
+{
+	int ret;
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	switch (attr) {
+	case hwmon_in_input:
+		ret = xe_hwmon_get_voltage(hwmon, val);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+	}
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+
+	return ret;
+}
+
 static umode_t
 xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
 		    u32 attr, int channel)
@@ -334,6 +386,9 @@ xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
 	case hwmon_curr:
 		ret = xe_hwmon_curr_is_visible(hwmon, attr);
 		break;
+	case hwmon_in:
+		ret = xe_hwmon_in_is_visible(hwmon, attr);
+		break;
 	default:
 		ret = 0;
 		break;
@@ -360,6 +415,9 @@ xe_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 	case hwmon_curr:
 		ret = xe_hwmon_curr_read(hwmon, attr, val);
 		break;
+	case hwmon_in:
+		ret = xe_hwmon_in_read(hwmon, attr, val);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v6 4/5] drm/xe/hwmon: Expose hwmon energy attribute
  2023-09-25  8:18 [PATCH v6 0/5] Add HWMON support for DGFX Badal Nilawar
                   ` (2 preceding siblings ...)
  2023-09-25  8:18 ` [PATCH v6 3/5] drm/xe/hwmon: Expose input voltage attribute Badal Nilawar
@ 2023-09-25  8:18 ` Badal Nilawar
  2023-09-25 11:49   ` Andi Shyti
  2023-09-25  8:18 ` [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval Badal Nilawar
  4 siblings, 1 reply; 32+ messages in thread
From: Badal Nilawar @ 2023-09-25  8:18 UTC (permalink / raw)
  To: intel-xe, linux-hwmon
  Cc: anshuman.gupta, ashutosh.dixit, linux, andi.shyti, riana.tauro,
	matthew.brost, rodrigo.vivi

Expose hwmon energy attribute to show device level energy usage

v2:
  - %s/hwm_/hwmon_/
  - Convert enums to upper case
v3:
  - %s/hwmon_/xe_hwmon
  - Remove gt specific hwmon attributes
v4:
 - %s/REG_PKG_ENERGY_STATUS/REG_ENERGY_STATUS_ALL (Riana)
 - %s/hwmon_energy_info/xe_hwmon_energy_info (Riana)

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 .../ABI/testing/sysfs-driver-intel-xe-hwmon   |   7 ++
 drivers/gpu/drm/xe/regs/xe_gt_regs.h          |   2 +
 drivers/gpu/drm/xe/regs/xe_mchbar_regs.h      |   3 +
 drivers/gpu/drm/xe/xe_hwmon.c                 | 105 +++++++++++++++++-
 4 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
index 7f9407c20864..1a7a6c23e141 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
@@ -52,3 +52,10 @@ Description:	RO. Current Voltage in millivolt.
 
 		Only supported for particular Intel xe graphics platforms.
 
+What:		/sys/devices/.../hwmon/hwmon<i>/energy1_input
+Date:		September 2023
+KernelVersion:	6.5
+Contact:	intel-xe@lists.freedesktop.org
+Description:	RO. Energy input of device in microjoules.
+
+		Only supported for particular Intel xe graphics platforms.
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 102663cbc320..9e6ce74fdd68 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -413,8 +413,10 @@
 #define XEHPC_BCS5_BCS6_INTR_MASK		XE_REG(0x190118)
 #define XEHPC_BCS7_BCS8_INTR_MASK		XE_REG(0x19011c)
 
+#define PVC_GT0_PACKAGE_ENERGY_STATUS		XE_REG(0x281004)
 #define PVC_GT0_PACKAGE_RAPL_LIMIT		XE_REG(0x281008)
 #define PVC_GT0_PACKAGE_POWER_SKU_UNIT		XE_REG(0x281068)
+#define PVC_GT0_PLATFORM_ENERGY_STATUS		XE_REG(0x28106c)
 #define PVC_GT0_PACKAGE_POWER_SKU		XE_REG(0x281080)
 
 #endif
diff --git a/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h b/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
index 27f1d42baf6d..d8ecbe1858d1 100644
--- a/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
@@ -25,6 +25,9 @@
 
 #define PCU_CR_PACKAGE_POWER_SKU_UNIT		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x5938)
 #define   PKG_PWR_UNIT				REG_GENMASK(3, 0)
+#define   PKG_ENERGY_UNIT			REG_GENMASK(12, 8)
+
+#define PCU_CR_PACKAGE_ENERGY_STATUS		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x593c)
 
 #define PCU_CR_PACKAGE_RAPL_LIMIT		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x59a0)
 #define   PKG_PWR_LIM_1				REG_GENMASK(14, 0)
diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index 1079145c81c9..1deb5007e1e2 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -22,6 +22,7 @@ enum xe_hwmon_reg {
 	REG_PKG_POWER_SKU,
 	REG_PKG_POWER_SKU_UNIT,
 	REG_GT_PERF_STATUS,
+	REG_PKG_ENERGY_STATUS,
 };
 
 enum xe_hwmon_reg_operation {
@@ -36,12 +37,20 @@ enum xe_hwmon_reg_operation {
 #define SF_POWER	1000000		/* microwatts */
 #define SF_CURR		1000		/* milliamperes */
 #define SF_VOLTAGE	1000		/* millivolts */
+#define SF_ENERGY	1000000		/* microjoules */
+
+struct xe_hwmon_energy_info {
+	u32 reg_val_prev;
+	long accum_energy;		/* Accumulated energy for energy1_input */
+};
 
 struct xe_hwmon {
 	struct device *hwmon_dev;
 	struct xe_gt *gt;
 	struct mutex hwmon_lock; /* rmw operations*/
 	int scl_shift_power;
+	int scl_shift_energy;
+	struct xe_hwmon_energy_info ei;	/*  Energy info for energy1_input */
 };
 
 static u32 xe_hwmon_get_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg)
@@ -72,6 +81,12 @@ static u32 xe_hwmon_get_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg)
 		if (xe->info.platform == XE_DG2)
 			reg = GT_PERF_STATUS;
 		break;
+	case REG_PKG_ENERGY_STATUS:
+		if (xe->info.platform == XE_DG2)
+			reg = PCU_CR_PACKAGE_ENERGY_STATUS;
+		else if (xe->info.platform == XE_PVC)
+			reg = PVC_GT0_PLATFORM_ENERGY_STATUS;
+		break;
 	default:
 		drm_warn(&xe->drm, "Unknown xe hwmon reg id: %d\n", hwmon_reg);
 		break;
@@ -193,10 +208,59 @@ static int xe_hwmon_power_rated_max_read(struct xe_hwmon *hwmon, long *value)
 	return 0;
 }
 
+/*
+ * xe_hwmon_energy_get - Obtain energy value
+ *
+ * The underlying energy hardware register is 32-bits and is subject to
+ * overflow. How long before overflow? For example, with an example
+ * scaling bit shift of 14 bits (see register *PACKAGE_POWER_SKU_UNIT) and
+ * a power draw of 1000 watts, the 32-bit counter will overflow in
+ * approximately 4.36 minutes.
+ *
+ * Examples:
+ *    1 watt:  (2^32 >> 14) /    1 W / (60 * 60 * 24) secs/day -> 3 days
+ * 1000 watts: (2^32 >> 14) / 1000 W / 60             secs/min -> 4.36 minutes
+ *
+ * The function significantly increases overflow duration (from 4.36
+ * minutes) by accumulating the energy register into a 'long' as allowed by
+ * the hwmon API. Using x86_64 128 bit arithmetic (see mul_u64_u32_shr()),
+ * a 'long' of 63 bits, SF_ENERGY of 1e6 (~20 bits) and
+ * hwmon->scl_shift_energy of 14 bits we have 57 (63 - 20 + 14) bits before
+ * energy1_input overflows. This at 1000 W is an overflow duration of 278 years.
+ */
+static void
+xe_hwmon_energy_get(struct xe_hwmon *hwmon, long *energy)
+{
+	struct xe_hwmon_energy_info *ei = &hwmon->ei;
+	u32 reg_val;
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	mutex_lock(&hwmon->hwmon_lock);
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_ENERGY_STATUS, REG_READ,
+			     &reg_val, 0, 0);
+
+	if (reg_val >= ei->reg_val_prev)
+		ei->accum_energy += reg_val - ei->reg_val_prev;
+	else
+		ei->accum_energy += UINT_MAX - ei->reg_val_prev + reg_val;
+
+	ei->reg_val_prev = reg_val;
+
+	*energy = mul_u64_u32_shr(ei->accum_energy, SF_ENERGY,
+				  hwmon->scl_shift_energy);
+
+	mutex_unlock(&hwmon->hwmon_lock);
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+}
+
 static const struct hwmon_channel_info *hwmon_info[] = {
 	HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX | HWMON_P_CRIT),
 	HWMON_CHANNEL_INFO(curr, HWMON_C_CRIT),
 	HWMON_CHANNEL_INFO(in, HWMON_I_INPUT),
+	HWMON_CHANNEL_INFO(energy, HWMON_E_INPUT),
 	NULL
 };
 
@@ -370,6 +434,29 @@ xe_hwmon_in_read(struct xe_hwmon *hwmon, u32 attr, long *val)
 	return ret;
 }
 
+static umode_t
+xe_hwmon_energy_is_visible(struct xe_hwmon *hwmon, u32 attr)
+{
+	switch (attr) {
+	case hwmon_energy_input:
+		return xe_hwmon_get_reg(hwmon, REG_PKG_ENERGY_STATUS) ? 0444 : 0;
+	default:
+		return 0;
+	}
+}
+
+static int
+xe_hwmon_energy_read(struct xe_hwmon *hwmon, u32 attr, long *val)
+{
+	switch (attr) {
+	case hwmon_energy_input:
+		xe_hwmon_energy_get(hwmon, val);
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static umode_t
 xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
 		    u32 attr, int channel)
@@ -389,6 +476,9 @@ xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
 	case hwmon_in:
 		ret = xe_hwmon_in_is_visible(hwmon, attr);
 		break;
+	case hwmon_energy:
+		ret = xe_hwmon_energy_is_visible(hwmon, attr);
+		break;
 	default:
 		ret = 0;
 		break;
@@ -418,6 +508,9 @@ xe_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 	case hwmon_in:
 		ret = xe_hwmon_in_read(hwmon, attr, val);
 		break;
+	case hwmon_energy:
+		ret = xe_hwmon_energy_read(hwmon, attr, val);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
@@ -469,6 +562,7 @@ static void
 xe_hwmon_get_preregistration_info(struct xe_device *xe)
 {
 	struct xe_hwmon *hwmon = xe->hwmon;
+	long energy;
 	u32 val_sku_unit = 0;
 	int ret;
 
@@ -477,8 +571,17 @@ xe_hwmon_get_preregistration_info(struct xe_device *xe)
 	 * The contents of register PKG_POWER_SKU_UNIT do not change,
 	 * so read it once and store the shift values.
 	 */
-	if (!ret)
+	if (!ret) {
 		hwmon->scl_shift_power = REG_FIELD_GET(PKG_PWR_UNIT, val_sku_unit);
+		hwmon->scl_shift_energy = REG_FIELD_GET(PKG_ENERGY_UNIT, val_sku_unit);
+	}
+
+	/*
+	 * Initialize 'struct xe_hwmon_energy_info', i.e. set fields to the
+	 * first value of the energy register read
+	 */
+	if (xe_hwmon_is_visible(hwmon, hwmon_energy, hwmon_energy_input, 0))
+		xe_hwmon_energy_get(hwmon, &energy);
 }
 
 void xe_hwmon_register(struct xe_device *xe)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
  2023-09-25  8:18 [PATCH v6 0/5] Add HWMON support for DGFX Badal Nilawar
                   ` (3 preceding siblings ...)
  2023-09-25  8:18 ` [PATCH v6 4/5] drm/xe/hwmon: Expose hwmon energy attribute Badal Nilawar
@ 2023-09-25  8:18 ` Badal Nilawar
  2023-09-25 11:56   ` Andi Shyti
  4 siblings, 1 reply; 32+ messages in thread
From: Badal Nilawar @ 2023-09-25  8:18 UTC (permalink / raw)
  To: intel-xe, linux-hwmon
  Cc: anshuman.gupta, ashutosh.dixit, linux, andi.shyti, riana.tauro,
	matthew.brost, rodrigo.vivi

Expose power1_max_interval, that is the tau corresponding to PL1, as a
custom hwmon attribute. Some bit manipulation is needed because of the
format of PKG_PWR_LIM_1_TIME in
PACKAGE_RAPL_LIMIT register (1.x * power(2,y))

v2: Get rpm wake ref while accessing power1_max_interval
v3: %s/hwmon/xe_hwmon/

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 .../ABI/testing/sysfs-driver-intel-xe-hwmon   |  11 ++
 drivers/gpu/drm/xe/regs/xe_mchbar_regs.h      |   8 +
 drivers/gpu/drm/xe/xe_hwmon.c                 | 138 +++++++++++++++++-
 3 files changed, 156 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
index 1a7a6c23e141..9ceb9c04b52b 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
@@ -59,3 +59,14 @@ Contact:	intel-xe@lists.freedesktop.org
 Description:	RO. Energy input of device in microjoules.
 
 		Only supported for particular Intel xe graphics platforms.
+
+What:		/sys/devices/.../hwmon/hwmon<i>/power1_max_interval
+Date:		September 2023
+KernelVersion:	6.5
+Contact:	intel-xe@lists.freedesktop.org
+Description:	RW. Sustained power limit interval (Tau in PL1/Tau) in
+		milliseconds over which sustained power is averaged.
+
+		Only supported for particular Intel xe graphics platforms.
+
+
diff --git a/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h b/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
index d8ecbe1858d1..519dd1067a19 100644
--- a/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_mchbar_regs.h
@@ -22,15 +22,23 @@
 #define   PKG_TDP				GENMASK_ULL(14, 0)
 #define   PKG_MIN_PWR				GENMASK_ULL(30, 16)
 #define   PKG_MAX_PWR				GENMASK_ULL(46, 32)
+#define   PKG_MAX_WIN				GENMASK_ULL(54, 48)
+#define     PKG_MAX_WIN_X			GENMASK_ULL(54, 53)
+#define     PKG_MAX_WIN_Y			GENMASK_ULL(52, 48)
+
 
 #define PCU_CR_PACKAGE_POWER_SKU_UNIT		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x5938)
 #define   PKG_PWR_UNIT				REG_GENMASK(3, 0)
 #define   PKG_ENERGY_UNIT			REG_GENMASK(12, 8)
+#define   PKG_TIME_UNIT				REG_GENMASK(19, 16)
 
 #define PCU_CR_PACKAGE_ENERGY_STATUS		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x593c)
 
 #define PCU_CR_PACKAGE_RAPL_LIMIT		XE_REG(MCHBAR_MIRROR_BASE_SNB + 0x59a0)
 #define   PKG_PWR_LIM_1				REG_GENMASK(14, 0)
 #define   PKG_PWR_LIM_1_EN			REG_BIT(15)
+#define   PKG_PWR_LIM_1_TIME			REG_GENMASK(23, 17)
+#define   PKG_PWR_LIM_1_TIME_X			REG_GENMASK(23, 22)
+#define   PKG_PWR_LIM_1_TIME_Y			REG_GENMASK(21, 17)
 
 #endif /* _XE_MCHBAR_REGS_H_ */
diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index 1deb5007e1e2..5bd5e5e9958b 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -38,6 +38,7 @@ enum xe_hwmon_reg_operation {
 #define SF_CURR		1000		/* milliamperes */
 #define SF_VOLTAGE	1000		/* millivolts */
 #define SF_ENERGY	1000000		/* microjoules */
+#define SF_TIME		1000		/* milliseconds */
 
 struct xe_hwmon_energy_info {
 	u32 reg_val_prev;
@@ -50,6 +51,7 @@ struct xe_hwmon {
 	struct mutex hwmon_lock; /* rmw operations*/
 	int scl_shift_power;
 	int scl_shift_energy;
+	int scl_shift_time;
 	struct xe_hwmon_energy_info ei;	/*  Energy info for energy1_input */
 };
 
@@ -256,6 +258,138 @@ xe_hwmon_energy_get(struct xe_hwmon *hwmon, long *energy)
 	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
 }
 
+static ssize_t
+xe_hwmon_power1_max_interval_show(struct device *dev, struct device_attribute *attr,
+				  char *buf)
+{
+	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
+	u32 r, x, y, x_w = 2; /* 2 bits */
+	u64 tau4, out;
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT,
+			     REG_READ, &r, 0, 0);
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+
+	x = REG_FIELD_GET(PKG_PWR_LIM_1_TIME_X, r);
+	y = REG_FIELD_GET(PKG_PWR_LIM_1_TIME_Y, r);
+	/*
+	 * tau = 1.x * power(2,y), x = bits(23:22), y = bits(21:17)
+	 *     = (4 | x) << (y - 2)
+	 * where (y - 2) ensures a 1.x fixed point representation of 1.x
+	 * However because y can be < 2, we compute
+	 *     tau4 = (4 | x) << y
+	 * but add 2 when doing the final right shift to account for units
+	 */
+	tau4 = ((1 << x_w) | x) << y;
+	/* val in hwmon interface units (millisec) */
+	out = mul_u64_u32_shr(tau4, SF_TIME, hwmon->scl_shift_time + x_w);
+
+	return sysfs_emit(buf, "%llu\n", out);
+}
+
+static ssize_t
+xe_hwmon_power1_max_interval_store(struct device *dev, struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
+	u32 x, y, rxy, x_w = 2; /* 2 bits */
+	u64 tau4, r, max_win;
+	unsigned long val;
+	int ret;
+
+	ret = kstrtoul(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	/*
+	 * Max HW supported tau in '1.x * power(2,y)' format, x = 0, y = 0x12
+	 * The hwmon->scl_shift_time default of 0xa results in a max tau of 256 seconds
+	 */
+#define PKG_MAX_WIN_DEFAULT 0x12ull
+
+	/*
+	 * val must be < max in hwmon interface units. The steps below are
+	 * explained in xe_hwmon_power1_max_interval_show()
+	 */
+	r = FIELD_PREP(PKG_MAX_WIN, PKG_MAX_WIN_DEFAULT);
+	x = REG_FIELD_GET(PKG_MAX_WIN_X, r);
+	y = REG_FIELD_GET(PKG_MAX_WIN_Y, r);
+	tau4 = ((1 << x_w) | x) << y;
+	max_win = mul_u64_u32_shr(tau4, SF_TIME, hwmon->scl_shift_time + x_w);
+
+	if (val > max_win)
+		return -EINVAL;
+
+	/* val in hw units */
+	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
+	/* Convert to 1.x * power(2,y) */
+	if (!val) {
+		/* Avoid ilog2(0) */
+		y = 0;
+		x = 0;
+	} else {
+		y = ilog2(val);
+		/* x = (val - (1 << y)) >> (y - 2); */
+		x = (val - (1ul << y)) << x_w >> y;
+	}
+
+	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	mutex_lock(&hwmon->hwmon_lock);
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, (u32 *)&r,
+			     PKG_PWR_LIM_1_TIME, rxy);
+
+	mutex_unlock(&hwmon->hwmon_lock);
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+
+	return count;
+}
+
+static SENSOR_DEVICE_ATTR(power1_max_interval, 0664,
+			  xe_hwmon_power1_max_interval_show,
+			  xe_hwmon_power1_max_interval_store, 0);
+
+static struct attribute *hwmon_attributes[] = {
+	&sensor_dev_attr_power1_max_interval.dev_attr.attr,
+	NULL
+};
+
+static umode_t xe_hwmon_attributes_visible(struct kobject *kobj,
+					   struct attribute *attr, int index)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
+	u32 reg_val;
+	int ret = 0;
+
+	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
+
+	if (attr == &sensor_dev_attr_power1_max_interval.dev_attr.attr)
+		ret =  xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT,
+					    REG_READ, &reg_val, 0, 0) ? 0 : attr->mode;
+
+	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
+
+	return ret;
+}
+
+static const struct attribute_group hwmon_attrgroup = {
+	.attrs = hwmon_attributes,
+	.is_visible = xe_hwmon_attributes_visible,
+};
+
+static const struct attribute_group *hwmon_groups[] = {
+	&hwmon_attrgroup,
+	NULL
+};
+
 static const struct hwmon_channel_info *hwmon_info[] = {
 	HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX | HWMON_P_CRIT),
 	HWMON_CHANNEL_INFO(curr, HWMON_C_CRIT),
@@ -574,6 +708,7 @@ xe_hwmon_get_preregistration_info(struct xe_device *xe)
 	if (!ret) {
 		hwmon->scl_shift_power = REG_FIELD_GET(PKG_PWR_UNIT, val_sku_unit);
 		hwmon->scl_shift_energy = REG_FIELD_GET(PKG_ENERGY_UNIT, val_sku_unit);
+		hwmon->scl_shift_time = REG_FIELD_GET(PKG_TIME_UNIT, val_sku_unit);
 	}
 
 	/*
@@ -611,7 +746,8 @@ void xe_hwmon_register(struct xe_device *xe)
 	/*  hwmon_dev points to device hwmon<i> */
 	hwmon->hwmon_dev = devm_hwmon_device_register_with_info(dev, "xe", hwmon,
 								&hwmon_chip_info,
-								NULL);
+								hwmon_groups);
+
 	if (IS_ERR(hwmon->hwmon_dev)) {
 		drm_warn(&xe->drm, "Failed to register xe hwmon (%pe)\n", hwmon->hwmon_dev);
 		xe->hwmon = NULL;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
@ 2023-09-25  8:58   ` Andi Shyti
  2023-09-27  4:45   ` Dixit, Ashutosh
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 32+ messages in thread
From: Andi Shyti @ 2023-09-25  8:58 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit, linux,
	andi.shyti, riana.tauro, matthew.brost, rodrigo.vivi

Hi Badal,

On Mon, Sep 25, 2023 at 01:48:38PM +0530, Badal Nilawar wrote:
> Expose Card reactive sustained (pl1) power limit as power_max and
> card default power limit (tdp) as power_rated_max.
> 
> v2:
>   - Fix review comments (Riana)
> v3:
>   - Use drmm_mutex_init (Matt Brost)
>   - Print error value (Matt Brost)
>   - Convert enums to uppercase (Matt Brost)
>   - Avoid extra reg read in hwmon_is_visible function (Riana)
>   - Use xe_device_assert_mem_access when applicable (Matt Brost)
>   - Add intel-xe@lists.freedesktop.org in Documentation (Matt Brost)
> v4:
>   - Use prefix xe_hwmon prefix for all functions (Matt Brost/Andi)
>   - %s/hwmon_reg/xe_hwmon_reg (Andi)
>   - Fix review comments (Guenter/Andi)
> v5:
>   - Fix review comments (Riana)
> v6:
>   - Use drm_warn in default case (Rodrigo)
>   - s/ENODEV/EOPNOTSUPP (Andi)
> 
> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Riana Tauro <riana.tauro@intel.com>
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>

looks good to me:

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> 

Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 2/5] drm/xe/hwmon: Expose card reactive critical power
  2023-09-25  8:18 ` [PATCH v6 2/5] drm/xe/hwmon: Expose card reactive critical power Badal Nilawar
@ 2023-09-25  9:03   ` Andi Shyti
  0 siblings, 0 replies; 32+ messages in thread
From: Andi Shyti @ 2023-09-25  9:03 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit, linux,
	andi.shyti, riana.tauro, matthew.brost, rodrigo.vivi

Hi Badal,

On Mon, Sep 25, 2023 at 01:48:39PM +0530, Badal Nilawar wrote:
> Expose the card reactive critical (I1) power. I1 is exposed as
> power1_crit in microwatts (typically for client products) or as
> curr1_crit in milliamperes (typically for server).
> 
> v2: Move PCODE_MBOX macro to pcode file (Riana)
> v3: s/IS_DG2/(gt_to_xe(gt)->info.platform == XE_DG2)
> v4: Fix review comments (Andi)
> 
> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Riana Tauro <riana.tauro@intel.com>
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> 

Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 3/5] drm/xe/hwmon: Expose input voltage attribute
  2023-09-25  8:18 ` [PATCH v6 3/5] drm/xe/hwmon: Expose input voltage attribute Badal Nilawar
@ 2023-09-25  9:04   ` Andi Shyti
  0 siblings, 0 replies; 32+ messages in thread
From: Andi Shyti @ 2023-09-25  9:04 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit, linux,
	andi.shyti, riana.tauro, matthew.brost, rodrigo.vivi

Hi Badal,

On Mon, Sep 25, 2023 at 01:48:40PM +0530, Badal Nilawar wrote:
> Use Xe HWMON subsystem to display the input voltage.
> 
> v2:
>   - Rename hwm_get_vltg to hwm_get_voltage (Riana)
>   - Use scale factor SF_VOLTAGE (Riana)
> v3:
>   - %s/gt_perf_status/REG_GT_PERF_STATUS/
>   - Remove platform check from hwmon_get_voltage()
> v4:
>   - Fix review comments (Andi)
> 
> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Riana Tauro <riana.tauro@intel.com>
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> 

Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 4/5] drm/xe/hwmon: Expose hwmon energy attribute
  2023-09-25  8:18 ` [PATCH v6 4/5] drm/xe/hwmon: Expose hwmon energy attribute Badal Nilawar
@ 2023-09-25 11:49   ` Andi Shyti
  0 siblings, 0 replies; 32+ messages in thread
From: Andi Shyti @ 2023-09-25 11:49 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit, linux,
	andi.shyti, riana.tauro, matthew.brost, rodrigo.vivi

Hi Badal,

[...]

> +/*
> + * xe_hwmon_energy_get - Obtain energy value
> + *
> + * The underlying energy hardware register is 32-bits and is subject to
> + * overflow. How long before overflow? For example, with an example
> + * scaling bit shift of 14 bits (see register *PACKAGE_POWER_SKU_UNIT) and
> + * a power draw of 1000 watts, the 32-bit counter will overflow in
> + * approximately 4.36 minutes.
> + *
> + * Examples:
> + *    1 watt:  (2^32 >> 14) /    1 W / (60 * 60 * 24) secs/day -> 3 days
> + * 1000 watts: (2^32 >> 14) / 1000 W / 60             secs/min -> 4.36 minutes
> + *
> + * The function significantly increases overflow duration (from 4.36
> + * minutes) by accumulating the energy register into a 'long' as allowed by
> + * the hwmon API. Using x86_64 128 bit arithmetic (see mul_u64_u32_shr()),
> + * a 'long' of 63 bits, SF_ENERGY of 1e6 (~20 bits) and
> + * hwmon->scl_shift_energy of 14 bits we have 57 (63 - 20 + 14) bits before
> + * energy1_input overflows. This at 1000 W is an overflow duration of 278 years.
> + */
> +static void
> +xe_hwmon_energy_get(struct xe_hwmon *hwmon, long *energy)

can we call it rather xe_hwmon_energy_update() as this is not
sending anything, but just updating the energy values.

Anyway...

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> 

Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
  2023-09-25  8:18 ` [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval Badal Nilawar
@ 2023-09-25 11:56   ` Andi Shyti
       [not found]     ` <e5801f36-2f9a-6d24-7af2-1e7174f2e0b4@intel.com>
  0 siblings, 1 reply; 32+ messages in thread
From: Andi Shyti @ 2023-09-25 11:56 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit, linux,
	andi.shyti, riana.tauro, matthew.brost, rodrigo.vivi

Hi Badal,

[...]

> +static ssize_t
> +xe_hwmon_power1_max_interval_store(struct device *dev, struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
> +	u32 x, y, rxy, x_w = 2; /* 2 bits */
> +	u64 tau4, r, max_win;
> +	unsigned long val;
> +	int ret;
> +
> +	ret = kstrtoul(buf, 0, &val);
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * Max HW supported tau in '1.x * power(2,y)' format, x = 0, y = 0x12
> +	 * The hwmon->scl_shift_time default of 0xa results in a max tau of 256 seconds
> +	 */
> +#define PKG_MAX_WIN_DEFAULT 0x12ull
> +
> +	/*
> +	 * val must be < max in hwmon interface units. The steps below are
> +	 * explained in xe_hwmon_power1_max_interval_show()
> +	 */
> +	r = FIELD_PREP(PKG_MAX_WIN, PKG_MAX_WIN_DEFAULT);
> +	x = REG_FIELD_GET(PKG_MAX_WIN_X, r);
> +	y = REG_FIELD_GET(PKG_MAX_WIN_Y, r);
> +	tau4 = ((1 << x_w) | x) << y;
> +	max_win = mul_u64_u32_shr(tau4, SF_TIME, hwmon->scl_shift_time + x_w);
> +
> +	if (val > max_win)
> +		return -EINVAL;
> +
> +	/* val in hw units */
> +	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
> +	/* Convert to 1.x * power(2,y) */
> +	if (!val) {
> +		/* Avoid ilog2(0) */
> +		y = 0;
> +		x = 0;
> +	} else {
> +		y = ilog2(val);
> +		/* x = (val - (1 << y)) >> (y - 2); */

this is some spurious development comment, can you please remove
it?

> +		x = (val - (1ul << y)) << x_w >> y;
> +	}
> +
> +	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
> +
> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> +
> +	mutex_lock(&hwmon->hwmon_lock);
> +
> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, (u32 *)&r,
> +			     PKG_PWR_LIM_1_TIME, rxy);
> +
> +	mutex_unlock(&hwmon->hwmon_lock);

why are we locking here?

Andi

> +
> +	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
> +
> +	return count;
> +}

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
       [not found]     ` <e5801f36-2f9a-6d24-7af2-1e7174f2e0b4@intel.com>
@ 2023-09-26  8:01       ` Andi Shyti
  2023-09-26  9:00         ` Nilawar, Badal
  0 siblings, 1 reply; 32+ messages in thread
From: Andi Shyti @ 2023-09-26  8:01 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: Andi Shyti, intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit,
	linux, riana.tauro, matthew.brost, rodrigo.vivi

Hi Badal,

> > > +	/* val in hw units */
> > > +	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
> > > +	/* Convert to 1.x * power(2,y) */
> > > +	if (!val) {
> > > +		/* Avoid ilog2(0) */
> > > +		y = 0;
> > > +		x = 0;
> > > +	} else {
> > > +		y = ilog2(val);
> > > +		/* x = (val - (1 << y)) >> (y - 2); */
> > 
> > this is some spurious development comment, can you please remove
> > it?
> 
> This is kept intentionally to help to understand the calculations.

then this is confusing... Can you please expand the concept?
As it is it's not understandable and I would expect someone
sending a patch with title:

 [PATCH] drm/xe/hwmon: Remove spurious comment

Because it just looks forgotten from previous development.

> > > +		x = (val - (1ul << y)) << x_w >> y;
> > > +	}
> > > +
> > > +	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
> > > +
> > > +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> > > +
> > > +	mutex_lock(&hwmon->hwmon_lock);
> > > +
> > > +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, (u32 *)&r,
> > > +			     PKG_PWR_LIM_1_TIME, rxy);
> > > +
> > > +	mutex_unlock(&hwmon->hwmon_lock);
> > 
> > why are we locking here?
> 
> Since it is rmw operation we are using lock here.

OK... so what you are trying to protect here is the

  read -> update -> write

and it makes sense. The problem is that if this is a generic
rule, which means that everyone who will do a rmw operation has
to take the lock, why not take the lock directly in
xe_hwmon_process_reg()?

But also this can be a bit confusing, because a function is
either locked or unlocked and purists might complain.

A suggestion would be to do something like:

   static int xe_hwmon_process_reg(..., enum xe_hwmon_reg_operation operation)
   {
   	...
   }

   static int xe_hwmon_reg_read(...);
   {
   	return xe_hwmon_process_reg(..., REG_READ);
   }

   static int xe_hwmon_reg_write(...);
   {
   	return xe_hwmon_process_reg(..., REG_WRITE);
   }

   static int xe_hwmon_reg_rmw(...);
   {
	int ret;
   	
	/*
	 * Optional: you can check that the lock is not taken
	 * to shout loud if potential deadlocks arise.
	 */

	/*
	 * We want to protect the register update with the
	 * lock blah blah blah... explanatory comment.
	 */
	mutex_lock(&hwmon->hwmon_lock);
	ret = xe_hwmon_process_reg(..., REG_RMW);
	mutex_unlock(&hwmon->hwmon_lock);

	return ret;
   }

What do you think? It looks much clearer to me.

Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
  2023-09-26  8:01       ` Andi Shyti
@ 2023-09-26  9:00         ` Nilawar, Badal
  2023-09-26 21:01           ` Andi Shyti
  0 siblings, 1 reply; 32+ messages in thread
From: Nilawar, Badal @ 2023-09-26  9:00 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit, linux,
	riana.tauro, matthew.brost, rodrigo.vivi

Hi Andi,

On 26-09-2023 13:31, Andi Shyti wrote:
> Hi Badal,
> 
>>>> +	/* val in hw units */
>>>> +	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
>>>> +	/* Convert to 1.x * power(2,y) */
>>>> +	if (!val) {
>>>> +		/* Avoid ilog2(0) */
>>>> +		y = 0;
>>>> +		x = 0;
>>>> +	} else {
>>>> +		y = ilog2(val);
>>>> +		/* x = (val - (1 << y)) >> (y - 2); */
>>>
>>> this is some spurious development comment, can you please remove
>>> it?
>>
>> This is kept intentionally to help to understand the calculations.
> 
> then this is confusing... Can you please expand the concept?
> As it is it's not understandable and I would expect someone
> sending a patch with title:
> 
>   [PATCH] drm/xe/hwmon: Remove spurious comment
> 
> Because it just looks forgotten from previous development.
I will add this comment inside the comment at the top of if. So it will 
look like.
/*
  * Convert to 1.x * power(2,y)
  * y = ilog(val);
  * x = (val - (1 << y)) >> (y-2);
  */
> 
>>>> +		x = (val - (1ul << y)) << x_w >> y;
>>>> +	}
>>>> +
>>>> +	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
>>>> +
>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
>>>> +
>>>> +	mutex_lock(&hwmon->hwmon_lock);
>>>> +
>>>> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, (u32 *)&r,
>>>> +			     PKG_PWR_LIM_1_TIME, rxy);
>>>> +
>>>> +	mutex_unlock(&hwmon->hwmon_lock);
>>>
>>> why are we locking here?
>>
>> Since it is rmw operation we are using lock here.
> 
> OK... so what you are trying to protect here is the
> 
>    read -> update -> write
> 
> and it makes sense. The problem is that if this is a generic
> rule, which means that everyone who will do a rmw operation has
> to take the lock, why not take the lock directly in
> xe_hwmon_process_reg()?
> 
> But also this can be a bit confusing, because a function is
> either locked or unlocked and purists might complain.
> 
> A suggestion would be to do something like:
> 
>     static int xe_hwmon_process_reg(..., enum xe_hwmon_reg_operation operation)
>     {
>     	...
>     }
> 
>     static int xe_hwmon_reg_read(...);
>     {
>     	return xe_hwmon_process_reg(..., REG_READ);
>     }
> 
>     static int xe_hwmon_reg_write(...);
>     {
>     	return xe_hwmon_process_reg(..., REG_WRITE);
>     }
> 
>     static int xe_hwmon_reg_rmw(...);
>     {
> 	int ret;
>     	
> 	/*
> 	 * Optional: you can check that the lock is not taken
> 	 * to shout loud if potential deadlocks arise.
> 	 */
> 
> 	/*
> 	 * We want to protect the register update with the
> 	 * lock blah blah blah... explanatory comment.
> 	 */
> 	mutex_lock(&hwmon->hwmon_lock);
> 	ret = xe_hwmon_process_reg(..., REG_RMW);
> 	mutex_unlock(&hwmon->hwmon_lock);
> 
> 	return ret;
>     }
> 
> What do you think? It looks much clearer to me.
REG_PKG_RAPL_LIMIT register is being written in xe_hwmon_power_max_write 
also, that's why lock is taken. But some how while cleaning up I forgot 
to take it in xe_hwmon_power_max_write(), thanks for catching it up. I 
will update xe_hwmon_power_max_write() and resend series.

Thanks,
Badal
> 
> Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
  2023-09-26  9:00         ` Nilawar, Badal
@ 2023-09-26 21:01           ` Andi Shyti
  2023-09-27  3:32             ` Dixit, Ashutosh
  0 siblings, 1 reply; 32+ messages in thread
From: Andi Shyti @ 2023-09-26 21:01 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: Andi Shyti, intel-xe, linux-hwmon, anshuman.gupta, ashutosh.dixit,
	linux, riana.tauro, matthew.brost, rodrigo.vivi

Hi Badal,

> > > > > +	/* val in hw units */
> > > > > +	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
> > > > > +	/* Convert to 1.x * power(2,y) */
> > > > > +	if (!val) {
> > > > > +		/* Avoid ilog2(0) */
> > > > > +		y = 0;
> > > > > +		x = 0;
> > > > > +	} else {
> > > > > +		y = ilog2(val);
> > > > > +		/* x = (val - (1 << y)) >> (y - 2); */
> > > > 
> > > > this is some spurious development comment, can you please remove
> > > > it?
> > > 
> > > This is kept intentionally to help to understand the calculations.
> > 
> > then this is confusing... Can you please expand the concept?
> > As it is it's not understandable and I would expect someone
> > sending a patch with title:
> > 
> >   [PATCH] drm/xe/hwmon: Remove spurious comment
> > 
> > Because it just looks forgotten from previous development.
> I will add this comment inside the comment at the top of if. So it will look
> like.
> /*
>  * Convert to 1.x * power(2,y)
>  * y = ilog(val);
>  * x = (val - (1 << y)) >> (y-2);
>  */

All right.

> > > > > +		x = (val - (1ul << y)) << x_w >> y;
> > > > > +	}
> > > > > +
> > > > > +	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
> > > > > +
> > > > > +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> > > > > +
> > > > > +	mutex_lock(&hwmon->hwmon_lock);
> > > > > +
> > > > > +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, (u32 *)&r,
> > > > > +			     PKG_PWR_LIM_1_TIME, rxy);
> > > > > +
> > > > > +	mutex_unlock(&hwmon->hwmon_lock);
> > > > 
> > > > why are we locking here?
> > > 
> > > Since it is rmw operation we are using lock here.
> > 
> > OK... so what you are trying to protect here is the
> > 
> >    read -> update -> write
> > 
> > and it makes sense. The problem is that if this is a generic
> > rule, which means that everyone who will do a rmw operation has
> > to take the lock, why not take the lock directly in
> > xe_hwmon_process_reg()?
> > 
> > But also this can be a bit confusing, because a function is
> > either locked or unlocked and purists might complain.
> > 
> > A suggestion would be to do something like:
> > 
> >     static int xe_hwmon_process_reg(..., enum xe_hwmon_reg_operation operation)
> >     {
> >     	...
> >     }
> > 
> >     static int xe_hwmon_reg_read(...);
> >     {
> >     	return xe_hwmon_process_reg(..., REG_READ);
> >     }
> > 
> >     static int xe_hwmon_reg_write(...);
> >     {
> >     	return xe_hwmon_process_reg(..., REG_WRITE);
> >     }
> > 
> >     static int xe_hwmon_reg_rmw(...);
> >     {
> > 	int ret;
> >     	
> > 	/*
> > 	 * Optional: you can check that the lock is not taken
> > 	 * to shout loud if potential deadlocks arise.
> > 	 */
> > 
> > 	/*
> > 	 * We want to protect the register update with the
> > 	 * lock blah blah blah... explanatory comment.
> > 	 */
> > 	mutex_lock(&hwmon->hwmon_lock);
> > 	ret = xe_hwmon_process_reg(..., REG_RMW);
> > 	mutex_unlock(&hwmon->hwmon_lock);
> > 
> > 	return ret;
> >     }
> > 
> > What do you think? It looks much clearer to me.
> 
> REG_PKG_RAPL_LIMIT register is being written in xe_hwmon_power_max_write
> also, that's why lock is taken. But some how while cleaning up I forgot to
> take it in xe_hwmon_power_max_write(), thanks for catching it up. I will
> update xe_hwmon_power_max_write() and resend series.

yes... OK... then, please add the lock also in the write case.

But still... thinking of hwmon running in more threads don't you
think we might need a more generic locking? This might mean to
lock all around xe_hwmon_process_reg()... maybe it's an overkill.

For the time being I'm OK with your current solution.

If you don't like my suggestion above, feel free to ignore it.

Andi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
  2023-09-26 21:01           ` Andi Shyti
@ 2023-09-27  3:32             ` Dixit, Ashutosh
  2023-09-27  9:04               ` Nilawar, Badal
  0 siblings, 1 reply; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-27  3:32 UTC (permalink / raw)
  To: Andi Shyti, Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, riana.tauro,
	matthew.brost, rodrigo.vivi

On Tue, 26 Sep 2023 14:01:06 -0700, Andi Shyti wrote:
>

Hi Badal/Andi,

>
> > > > > > +	/* val in hw units */
> > > > > > +	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
> > > > > > +	/* Convert to 1.x * power(2,y) */
> > > > > > +	if (!val) {
> > > > > > +		/* Avoid ilog2(0) */
> > > > > > +		y = 0;
> > > > > > +		x = 0;
> > > > > > +	} else {
> > > > > > +		y = ilog2(val);
> > > > > > +		/* x = (val - (1 << y)) >> (y - 2); */
> > > > >
> > > > > this is some spurious development comment, can you please remove
> > > > > it?
> > > >
> > > > This is kept intentionally to help to understand the calculations.
> > >
> > > then this is confusing... Can you please expand the concept?
> > > As it is it's not understandable and I would expect someone
> > > sending a patch with title:
> > >
> > >   [PATCH] drm/xe/hwmon: Remove spurious comment
> > >
> > > Because it just looks forgotten from previous development.
> > I will add this comment inside the comment at the top of if. So it will look
> > like.
> > /*
> >  * Convert to 1.x * power(2,y)
> >  * y = ilog(val);
> >  * x = (val - (1 << y)) >> (y-2);
> >  */
>
> All right.

That comment is explaining that one line of code. I think we should just
leave it where it is, it doesn't make sense to move it above the if. How
else can we write it so that the comment doesn't look like a leftover?

If the code is clear we can remove the comment, but I think the code is
hard to understand. So try to understand the code and then you will need
the comment.

>
> > > > > > +		x = (val - (1ul << y)) << x_w >> y;
> > > > > > +	}
> > > > > > +
> > > > > > +	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
> > > > > > +
> > > > > > +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> > > > > > +
> > > > > > +	mutex_lock(&hwmon->hwmon_lock);
> > > > > > +
> > > > > > +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, (u32 *)&r,
> > > > > > +			     PKG_PWR_LIM_1_TIME, rxy);
> > > > > > +
> > > > > > +	mutex_unlock(&hwmon->hwmon_lock);
> > > > >
> > > > > why are we locking here?
> > > >
> > > > Since it is rmw operation we are using lock here.
> > >
> > > OK... so what you are trying to protect here is the
> > >
> > >    read -> update -> write
> > >
> > > and it makes sense. The problem is that if this is a generic
> > > rule, which means that everyone who will do a rmw operation has
> > > to take the lock, why not take the lock directly in
> > > xe_hwmon_process_reg()?
> > >
> > > But also this can be a bit confusing, because a function is
> > > either locked or unlocked and purists might complain.
> > >
> > > A suggestion would be to do something like:
> > >
> > >     static int xe_hwmon_process_reg(..., enum xe_hwmon_reg_operation operation)
> > >     {
> > >		...
> > >     }
> > >
> > >     static int xe_hwmon_reg_read(...);
> > >     {
> > >		return xe_hwmon_process_reg(..., REG_READ);
> > >     }
> > >
> > >     static int xe_hwmon_reg_write(...);
> > >     {
> > >		return xe_hwmon_process_reg(..., REG_WRITE);
> > >     }
> > >
> > >     static int xe_hwmon_reg_rmw(...);
> > >     {
> > >	int ret;
> > >
> > >	/*
> > >	 * Optional: you can check that the lock is not taken
> > >	 * to shout loud if potential deadlocks arise.
> > >	 */
> > >
> > >	/*
> > >	 * We want to protect the register update with the
> > >	 * lock blah blah blah... explanatory comment.
> > >	 */
> > >	mutex_lock(&hwmon->hwmon_lock);
> > >	ret = xe_hwmon_process_reg(..., REG_RMW);
> > >	mutex_unlock(&hwmon->hwmon_lock);
> > >
> > >	return ret;
> > >     }
> > >
> > > What do you think? It looks much clearer to me.
> >
> > REG_PKG_RAPL_LIMIT register is being written in xe_hwmon_power_max_write
> > also, that's why lock is taken. But some how while cleaning up I forgot to
> > take it in xe_hwmon_power_max_write(), thanks for catching it up. I will
> > update xe_hwmon_power_max_write() and resend series.
>
> yes... OK... then, please add the lock also in the write case.
>
> But still... thinking of hwmon running in more threads don't you
> think we might need a more generic locking? This might mean to
> lock all around xe_hwmon_process_reg()... maybe it's an overkill.
>
> For the time being I'm OK with your current solution.
>
> If you don't like my suggestion above, feel free to ignore it.

Yeah agree, might as well take the lock around the switch statement in
xe_hwmon_process_reg().

>
> Andi

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
  2023-09-25  8:58   ` Andi Shyti
@ 2023-09-27  4:45   ` Dixit, Ashutosh
  2023-09-27 10:28     ` Nilawar, Badal
  2023-09-27  4:53   ` Dixit, Ashutosh
  2023-09-28  4:55   ` Dixit, Ashutosh
  3 siblings, 1 reply; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-27  4:45 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>

Hi Badal,

> +static int xe_hwmon_process_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg,

Maybe xe_hwmon_read_write_reg? process_reg sounds bad. Basically we don't
process a register, we read or write it.

> +				enum xe_hwmon_reg_operation operation, u32 *value,
> +				u32 clr, u32 set)
> +{
> +	struct xe_reg reg;
> +
> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
> +
> +	if (!reg.raw)
> +		return -EOPNOTSUPP;
> +
> +	switch (operation) {
> +	case REG_READ:
> +		*value = xe_mmio_read32(hwmon->gt, reg);
> +		return 0;
> +	case REG_WRITE:
> +		xe_mmio_write32(hwmon->gt, reg, *value);
> +		return 0;
> +	case REG_RMW:
> +		*value = xe_mmio_rmw32(hwmon->gt, reg, clr, set);
> +		return 0;
> +	default:
> +		drm_warn(&gt_to_xe(hwmon->gt)->drm, "Invalid xe hwmon reg operation: %d\n",
> +			 operation);
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +int xe_hwmon_process_reg_read64(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg, u64 *value)
> +{
> +	struct xe_reg reg;
> +
> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
> +
> +	if (!reg.raw)
> +		return -EOPNOTSUPP;
> +
> +	*value = xe_mmio_read64_2x32(hwmon->gt, reg);
> +
> +	return 0;

We can't make read64 part of enum xe_hwmon_reg_operation?


> +}
> +
> +#define PL1_DISABLE 0
> +
> +/*
> + * HW allows arbitrary PL1 limits to be set but silently clamps these values to
> + * "typical but not guaranteed" min/max values in REG_PKG_POWER_SKU. Follow the
> + * same pattern for sysfs, allow arbitrary PL1 limits to be set but display
> + * clamped values when read.
> + */
> +static int xe_hwmon_power_max_read(struct xe_hwmon *hwmon, long *value)
> +{
> +	u32 reg_val;
> +	u64 reg_val64, min, max;
> +
> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val, 0, 0);
> +	/* Check if PL1 limit is disabled */
> +	if (!(reg_val & PKG_PWR_LIM_1_EN)) {
> +		*value = PL1_DISABLE;
> +		return 0;
> +	}
> +
> +	reg_val = REG_FIELD_GET(PKG_PWR_LIM_1, reg_val);
> +	*value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power);
> +
> +	xe_hwmon_process_reg_read64(hwmon, REG_PKG_POWER_SKU, &reg_val64);
> +	min = REG_FIELD_GET(PKG_MIN_PWR, reg_val64);
> +	min = mul_u64_u32_shr(min, SF_POWER, hwmon->scl_shift_power);
> +	max = REG_FIELD_GET(PKG_MAX_PWR, reg_val64);
> +	max = mul_u64_u32_shr(max, SF_POWER, hwmon->scl_shift_power);
> +
> +	if (min && max)
> +		*value = clamp_t(u64, *value, min, max);

Not exactly correct. Should be:

	if (min)
		clamp at min
	if (max)
		clamp at max

I was thinking of changing it for i915 but was lazy.


> +
> +	return 0;
> +}
> +
> +static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
> +{
> +	u32 reg_val;
> +
> +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
> +	if (value == PL1_DISABLE) {
> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
> +				     PKG_PWR_LIM_1_EN, 0);
> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,

If we are not checking for return codes from these functions, why are they
not void?

Also, how about separate read/write/rmw functions as Andi was suggesting?
They would be clearer I think.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
  2023-09-25  8:58   ` Andi Shyti
  2023-09-27  4:45   ` Dixit, Ashutosh
@ 2023-09-27  4:53   ` Dixit, Ashutosh
  2023-09-27  8:39     ` Nilawar, Badal
  2023-09-28  4:55   ` Dixit, Ashutosh
  3 siblings, 1 reply; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-27  4:53 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>
> +static umode_t
> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> +		    u32 attr, int channel)
> +{
> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> +	int ret;
> +
> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));

Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
doesn't read/write registers.

Also do we need to take forcewake? i915 had forcewake table so it would
take forcewake automatically but XE doesn't do that.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-27  4:53   ` Dixit, Ashutosh
@ 2023-09-27  8:39     ` Nilawar, Badal
  2023-09-28  4:55       ` Dixit, Ashutosh
  0 siblings, 1 reply; 32+ messages in thread
From: Nilawar, Badal @ 2023-09-27  8:39 UTC (permalink / raw)
  To: Dixit, Ashutosh
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi



On 27-09-2023 10:23, Dixit, Ashutosh wrote:
> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>>
>> +static umode_t
>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
>> +		    u32 attr, int channel)
>> +{
>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
>> +	int ret;
>> +
>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> 
> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
> doesn't read/write registers.
Agreed, but visible function is called only once while registering hwmon 
interface, which happen during driver probe. During driver probe device 
will be in resumed state. So no harm in keeping 
xe_device_mem_access_get/put in visible function.
> 
> Also do we need to take forcewake? i915 had forcewake table so it would
> take forcewake automatically but XE doesn't do that.
Hwmon regs doesn't fall under GT domain so doesn't need forcewake.
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
  2023-09-27  3:32             ` Dixit, Ashutosh
@ 2023-09-27  9:04               ` Nilawar, Badal
  2023-09-27  9:31                 ` Gupta, Anshuman
  0 siblings, 1 reply; 32+ messages in thread
From: Nilawar, Badal @ 2023-09-27  9:04 UTC (permalink / raw)
  To: Dixit, Ashutosh, Andi Shyti
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, riana.tauro,
	matthew.brost, rodrigo.vivi

Hi Ashutosh,

On 27-09-2023 09:02, Dixit, Ashutosh wrote:
> On Tue, 26 Sep 2023 14:01:06 -0700, Andi Shyti wrote:
>>
> 
> Hi Badal/Andi,
> 
>>
>>>>>>> +	/* val in hw units */
>>>>>>> +	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_time, SF_TIME);
>>>>>>> +	/* Convert to 1.x * power(2,y) */
>>>>>>> +	if (!val) {
>>>>>>> +		/* Avoid ilog2(0) */
>>>>>>> +		y = 0;
>>>>>>> +		x = 0;
>>>>>>> +	} else {
>>>>>>> +		y = ilog2(val);
>>>>>>> +		/* x = (val - (1 << y)) >> (y - 2); */
>>>>>>
>>>>>> this is some spurious development comment, can you please remove
>>>>>> it?
>>>>>
>>>>> This is kept intentionally to help to understand the calculations.
>>>>
>>>> then this is confusing... Can you please expand the concept?
>>>> As it is it's not understandable and I would expect someone
>>>> sending a patch with title:
>>>>
>>>>    [PATCH] drm/xe/hwmon: Remove spurious comment
>>>>
>>>> Because it just looks forgotten from previous development.
>>> I will add this comment inside the comment at the top of if. So it will look
>>> like.
>>> /*
>>>   * Convert to 1.x * power(2,y)
>>>   * y = ilog(val);
>>>   * x = (val - (1 << y)) >> (y-2);
>>>   */
>>
>> All right.
> 
> That comment is explaining that one line of code. I think we should just
> leave it where it is, it doesn't make sense to move it above the if. How
> else can we write it so that the comment doesn't look like a leftover?
> 
> If the code is clear we can remove the comment, but I think the code is
> hard to understand. So try to understand the code and then you will need
> the comment.
Agreed, I will keep this comment as it is.
> 
>>
>>>>>>> +		x = (val - (1ul << y)) << x_w >> y;
>>>>>>> +	}
>>>>>>> +
>>>>>>> +	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) | REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
>>>>>>> +
>>>>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
>>>>>>> +
>>>>>>> +	mutex_lock(&hwmon->hwmon_lock);
>>>>>>> +
>>>>>>> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, (u32 *)&r,
>>>>>>> +			     PKG_PWR_LIM_1_TIME, rxy);
>>>>>>> +
>>>>>>> +	mutex_unlock(&hwmon->hwmon_lock);
>>>>>>
>>>>>> why are we locking here?
>>>>>
>>>>> Since it is rmw operation we are using lock here.
>>>>
>>>> OK... so what you are trying to protect here is the
>>>>
>>>>     read -> update -> write
>>>>
>>>> and it makes sense. The problem is that if this is a generic
>>>> rule, which means that everyone who will do a rmw operation has
>>>> to take the lock, why not take the lock directly in
>>>> xe_hwmon_process_reg()?
>>>>
>>>> But also this can be a bit confusing, because a function is
>>>> either locked or unlocked and purists might complain.
>>>>
>>>> A suggestion would be to do something like:
>>>>
>>>>      static int xe_hwmon_process_reg(..., enum xe_hwmon_reg_operation operation)
>>>>      {
>>>> 		...
>>>>      }
>>>>
>>>>      static int xe_hwmon_reg_read(...);
>>>>      {
>>>> 		return xe_hwmon_process_reg(..., REG_READ);
>>>>      }
>>>>
>>>>      static int xe_hwmon_reg_write(...);
>>>>      {
>>>> 		return xe_hwmon_process_reg(..., REG_WRITE);
>>>>      }
>>>>
>>>>      static int xe_hwmon_reg_rmw(...);
>>>>      {
>>>> 	int ret;
>>>>
>>>> 	/*
>>>> 	 * Optional: you can check that the lock is not taken
>>>> 	 * to shout loud if potential deadlocks arise.
>>>> 	 */
>>>>
>>>> 	/*
>>>> 	 * We want to protect the register update with the
>>>> 	 * lock blah blah blah... explanatory comment.
>>>> 	 */
>>>> 	mutex_lock(&hwmon->hwmon_lock);
>>>> 	ret = xe_hwmon_process_reg(..., REG_RMW);
>>>> 	mutex_unlock(&hwmon->hwmon_lock);
>>>>
>>>> 	return ret;
>>>>      }
>>>>
>>>> What do you think? It looks much clearer to me.
>>>
>>> REG_PKG_RAPL_LIMIT register is being written in xe_hwmon_power_max_write
>>> also, that's why lock is taken. But some how while cleaning up I forgot to
>>> take it in xe_hwmon_power_max_write(), thanks for catching it up. I will
>>> update xe_hwmon_power_max_write() and resend series.
>>
>> yes... OK... then, please add the lock also in the write case.
>>
>> But still... thinking of hwmon running in more threads don't you
>> think we might need a more generic locking? This might mean to
>> lock all around xe_hwmon_process_reg()... maybe it's an overkill.
>>
>> For the time being I'm OK with your current solution.
>>
>> If you don't like my suggestion above, feel free to ignore it.
> 
> Yeah agree, might as well take the lock around the switch statement in
> xe_hwmon_process_reg().
Will there be a possibility where two different threads will access 
power attributes power1_max and power1_max_interval simultaneously and 
frequently. I am not able to think such scenario. If not then we can 
remove lock from here.

Regards.
Badal
> 
>>
>> Andi
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
  2023-09-27  9:04               ` Nilawar, Badal
@ 2023-09-27  9:31                 ` Gupta, Anshuman
  0 siblings, 0 replies; 32+ messages in thread
From: Gupta, Anshuman @ 2023-09-27  9:31 UTC (permalink / raw)
  To: Nilawar, Badal, Dixit, Ashutosh, Andi Shyti
  Cc: intel-xe@lists.freedesktop.org, linux-hwmon@vger.kernel.org,
	linux@roeck-us.net, Tauro, Riana, Brost, Matthew, Vivi, Rodrigo



> -----Original Message-----
> From: Nilawar, Badal <badal.nilawar@intel.com>
> Sent: Wednesday, September 27, 2023 2:35 PM
> To: Dixit, Ashutosh <ashutosh.dixit@intel.com>; Andi Shyti
> <andi.shyti@linux.intel.com>
> Cc: intel-xe@lists.freedesktop.org; linux-hwmon@vger.kernel.org; Gupta,
> Anshuman <anshuman.gupta@intel.com>; linux@roeck-us.net; Tauro, Riana
> <riana.tauro@intel.com>; Brost, Matthew <matthew.brost@intel.com>; Vivi,
> Rodrigo <rodrigo.vivi@intel.com>
> Subject: Re: [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval
> 
> Hi Ashutosh,
> 
> On 27-09-2023 09:02, Dixit, Ashutosh wrote:
> > On Tue, 26 Sep 2023 14:01:06 -0700, Andi Shyti wrote:
> >>
> >
> > Hi Badal/Andi,
> >
> >>
> >>>>>>> +	/* val in hw units */
> >>>>>>> +	val = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon-
> >scl_shift_time, SF_TIME);
> >>>>>>> +	/* Convert to 1.x * power(2,y) */
> >>>>>>> +	if (!val) {
> >>>>>>> +		/* Avoid ilog2(0) */
> >>>>>>> +		y = 0;
> >>>>>>> +		x = 0;
> >>>>>>> +	} else {
> >>>>>>> +		y = ilog2(val);
> >>>>>>> +		/* x = (val - (1 << y)) >> (y - 2); */
> >>>>>>
> >>>>>> this is some spurious development comment, can you please remove
> >>>>>> it?
> >>>>>
> >>>>> This is kept intentionally to help to understand the calculations.
> >>>>
> >>>> then this is confusing... Can you please expand the concept?
> >>>> As it is it's not understandable and I would expect someone sending
> >>>> a patch with title:
> >>>>
> >>>>    [PATCH] drm/xe/hwmon: Remove spurious comment
> >>>>
> >>>> Because it just looks forgotten from previous development.
> >>> I will add this comment inside the comment at the top of if. So it
> >>> will look like.
> >>> /*
> >>>   * Convert to 1.x * power(2,y)
> >>>   * y = ilog(val);
> >>>   * x = (val - (1 << y)) >> (y-2);
> >>>   */
> >>
> >> All right.
> >
> > That comment is explaining that one line of code. I think we should
> > just leave it where it is, it doesn't make sense to move it above the
> > if. How else can we write it so that the comment doesn't look like a leftover?
> >
> > If the code is clear we can remove the comment, but I think the code
> > is hard to understand. So try to understand the code and then you will
> > need the comment.
> Agreed, I will keep this comment as it is.
> >
> >>
> >>>>>>> +		x = (val - (1ul << y)) << x_w >> y;
> >>>>>>> +	}
> >>>>>>> +
> >>>>>>> +	rxy = REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_X, x) |
> >>>>>>> +REG_FIELD_PREP(PKG_PWR_LIM_1_TIME_Y, y);
> >>>>>>> +
> >>>>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> >>>>>>> +
> >>>>>>> +	mutex_lock(&hwmon->hwmon_lock);
> >>>>>>> +
> >>>>>>> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT,
> REG_RMW, (u32 *)&r,
> >>>>>>> +			     PKG_PWR_LIM_1_TIME, rxy);
> >>>>>>> +
> >>>>>>> +	mutex_unlock(&hwmon->hwmon_lock);
> >>>>>>
> >>>>>> why are we locking here?
> >>>>>
> >>>>> Since it is rmw operation we are using lock here.
> >>>>
> >>>> OK... so what you are trying to protect here is the
> >>>>
> >>>>     read -> update -> write
> >>>>
> >>>> and it makes sense. The problem is that if this is a generic rule,
> >>>> which means that everyone who will do a rmw operation has to take
> >>>> the lock, why not take the lock directly in xe_hwmon_process_reg()?
> >>>>
> >>>> But also this can be a bit confusing, because a function is either
> >>>> locked or unlocked and purists might complain.
> >>>>
> >>>> A suggestion would be to do something like:
> >>>>
> >>>>      static int xe_hwmon_process_reg(..., enum xe_hwmon_reg_operation
> operation)
> >>>>      {
> >>>> 		...
> >>>>      }
> >>>>
> >>>>      static int xe_hwmon_reg_read(...);
> >>>>      {
> >>>> 		return xe_hwmon_process_reg(..., REG_READ);
> >>>>      }
> >>>>
> >>>>      static int xe_hwmon_reg_write(...);
> >>>>      {
> >>>> 		return xe_hwmon_process_reg(..., REG_WRITE);
> >>>>      }
> >>>>
> >>>>      static int xe_hwmon_reg_rmw(...);
> >>>>      {
> >>>> 	int ret;
> >>>>
> >>>> 	/*
> >>>> 	 * Optional: you can check that the lock is not taken
> >>>> 	 * to shout loud if potential deadlocks arise.
> >>>> 	 */
> >>>>
> >>>> 	/*
> >>>> 	 * We want to protect the register update with the
> >>>> 	 * lock blah blah blah... explanatory comment.
> >>>> 	 */
> >>>> 	mutex_lock(&hwmon->hwmon_lock);
> >>>> 	ret = xe_hwmon_process_reg(..., REG_RMW);
> >>>> 	mutex_unlock(&hwmon->hwmon_lock);
> >>>>
> >>>> 	return ret;
> >>>>      }
> >>>>
> >>>> What do you think? It looks much clearer to me.
> >>>
> >>> REG_PKG_RAPL_LIMIT register is being written in
> >>> xe_hwmon_power_max_write also, that's why lock is taken. But some
> >>> how while cleaning up I forgot to take it in
> >>> xe_hwmon_power_max_write(), thanks for catching it up. I will update
> xe_hwmon_power_max_write() and resend series.
> >>
> >> yes... OK... then, please add the lock also in the write case.
> >>
> >> But still... thinking of hwmon running in more threads don't you
> >> think we might need a more generic locking? This might mean to lock
> >> all around xe_hwmon_process_reg()... maybe it's an overkill.
> >>
> >> For the time being I'm OK with your current solution.
> >>
> >> If you don't like my suggestion above, feel free to ignore it.
> >
> > Yeah agree, might as well take the lock around the switch statement in
> > xe_hwmon_process_reg().
> Will there be a possibility where two different threads will access power
> attributes power1_max and power1_max_interval simultaneously and
> frequently. I am not able to think such scenario. If not then we can remove
> lock from here.
There are read and write cases, as far as I can see the seq_read_iter always takes seq_file->lock
So read cases like hwm_energy won't need any lock in my opinion, we are protected by above sysfs layer.
https://elixir.bootlin.com/linux/latest/source/fs/seq_file.c#L171
But seq_write on another hand does not use any lock, so I also fees for any ATTR does any read/write operation
on REG_PKG_RAPL_LIMIT register need a lock here.
Thanks,
Anshuman Gupta.

> 
> Regards.
> Badal
> >
> >>
> >> Andi
> >
> > Thanks.
> > --
> > Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-27  4:45   ` Dixit, Ashutosh
@ 2023-09-27 10:28     ` Nilawar, Badal
  2023-09-28  4:54       ` Dixit, Ashutosh
  0 siblings, 1 reply; 32+ messages in thread
From: Nilawar, Badal @ 2023-09-27 10:28 UTC (permalink / raw)
  To: Dixit, Ashutosh
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

Hi Ashutosh,

On 27-09-2023 10:15, Dixit, Ashutosh wrote:
> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>>
> 
> Hi Badal,
> 
>> +static int xe_hwmon_process_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg,
> 
> Maybe xe_hwmon_read_write_reg? process_reg sounds bad. Basically we don't
> process a register, we read or write it.
I don't think it sound that bad. When we say process register apart from 
read/write/rmw what else we will be doing. I think lets not rename this 
function.
> 
>> +				enum xe_hwmon_reg_operation operation, u32 *value,
>> +				u32 clr, u32 set)
>> +{
>> +	struct xe_reg reg;
>> +
>> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
>> +
>> +	if (!reg.raw)
>> +		return -EOPNOTSUPP;
>> +
>> +	switch (operation) {
>> +	case REG_READ:
>> +		*value = xe_mmio_read32(hwmon->gt, reg);
>> +		return 0;
>> +	case REG_WRITE:
>> +		xe_mmio_write32(hwmon->gt, reg, *value);
>> +		return 0;
>> +	case REG_RMW:
>> +		*value = xe_mmio_rmw32(hwmon->gt, reg, clr, set);
>> +		return 0;
>> +	default:
>> +		drm_warn(&gt_to_xe(hwmon->gt)->drm, "Invalid xe hwmon reg operation: %d\n",
>> +			 operation);
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +int xe_hwmon_process_reg_read64(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg, u64 *value)
>> +{
>> +	struct xe_reg reg;
>> +
>> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
>> +
>> +	if (!reg.raw)
>> +		return -EOPNOTSUPP;
>> +
>> +	*value = xe_mmio_read64_2x32(hwmon->gt, reg);
>> +
>> +	return 0;
> 
> We can't make read64 part of enum xe_hwmon_reg_operation?
read64 takes argument "u64 *value" so kept it separate.
> 
> 
>> +}
>> +
>> +#define PL1_DISABLE 0
>> +
>> +/*
>> + * HW allows arbitrary PL1 limits to be set but silently clamps these values to
>> + * "typical but not guaranteed" min/max values in REG_PKG_POWER_SKU. Follow the
>> + * same pattern for sysfs, allow arbitrary PL1 limits to be set but display
>> + * clamped values when read.
>> + */
>> +static int xe_hwmon_power_max_read(struct xe_hwmon *hwmon, long *value)
>> +{
>> +	u32 reg_val;
>> +	u64 reg_val64, min, max;
>> +
>> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val, 0, 0);
>> +	/* Check if PL1 limit is disabled */
>> +	if (!(reg_val & PKG_PWR_LIM_1_EN)) {
>> +		*value = PL1_DISABLE;
>> +		return 0;
>> +	}
>> +
>> +	reg_val = REG_FIELD_GET(PKG_PWR_LIM_1, reg_val);
>> +	*value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power);
>> +
>> +	xe_hwmon_process_reg_read64(hwmon, REG_PKG_POWER_SKU, &reg_val64);
>> +	min = REG_FIELD_GET(PKG_MIN_PWR, reg_val64);
>> +	min = mul_u64_u32_shr(min, SF_POWER, hwmon->scl_shift_power);
>> +	max = REG_FIELD_GET(PKG_MAX_PWR, reg_val64);
>> +	max = mul_u64_u32_shr(max, SF_POWER, hwmon->scl_shift_power);
>> +
>> +	if (min && max)
>> +		*value = clamp_t(u64, *value, min, max);
> 
> Not exactly correct. Should be:
> 
> 	if (min)
> 		clamp at min
> 	if (max)
> 		clamp at max
> 
> I was thinking of changing it for i915 but was lazy.
Sure, thanks for pointing this.
> 
> 
>> +
>> +	return 0;
>> +}
>> +
>> +static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
>> +{
>> +	u32 reg_val;
>> +
>> +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
>> +	if (value == PL1_DISABLE) {
>> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
>> +				     PKG_PWR_LIM_1_EN, 0);
>> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,
> 
> If we are not checking for return codes from these functions, why are they
> not void?
Top level functions expect return. For function xe_hwmon_power_max_write 
returning error if PL1 disable not possible. The functions 
xe_hwmon_power_max_read/xe_hwmon_power_rated_max_read can be made void, 
then it will look like. What difference its going to make? I feel 
existing approach is much readable.

case hwmon_power_max:
          xe_hwmon_power_max_read(hwmon, val);
	 return 0;
case hwmon_power_rated_max:
          xe_hwmon_power_rated_max_read(hwmon, val);
	 return 0;
> 
> Also, how about separate read/write/rmw functions as Andi was suggesting?
> They would be clearer I think.
Would not prefer to add further abstraction, lets keep as is. Going 
further while adding new platforms will think about adding it.

Regards,
Badal
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-27 10:28     ` Nilawar, Badal
@ 2023-09-28  4:54       ` Dixit, Ashutosh
  0 siblings, 0 replies; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-28  4:54 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

On Wed, 27 Sep 2023 03:28:51 -0700, Nilawar, Badal wrote:
>
> Hi Ashutosh,
>
> On 27-09-2023 10:15, Dixit, Ashutosh wrote:
> > On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
> >>
> >
> > Hi Badal,
> >
> >> +static int xe_hwmon_process_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg,
> >
> > Maybe xe_hwmon_read_write_reg? process_reg sounds bad. Basically we don't
> > process a register, we read or write it.
> I don't think it sound that bad. When we say process register apart from
> read/write/rmw what else we will be doing. I think lets not rename this
> function.

OK, maybe leave as is (though another option is xe_hwmon_operate_reg since
we already have xe_hwmon_reg_op, or xe_hwmon_rw_reg).

> >
> >> +				enum xe_hwmon_reg_operation operation, u32 *value,
> >> +				u32 clr, u32 set)
> >> +{
> >> +	struct xe_reg reg;
> >> +
> >> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
> >> +
> >> +	if (!reg.raw)
> >> +		return -EOPNOTSUPP;
> >> +
> >> +	switch (operation) {
> >> +	case REG_READ:
> >> +		*value = xe_mmio_read32(hwmon->gt, reg);
> >> +		return 0;
> >> +	case REG_WRITE:
> >> +		xe_mmio_write32(hwmon->gt, reg, *value);
> >> +		return 0;
> >> +	case REG_RMW:
> >> +		*value = xe_mmio_rmw32(hwmon->gt, reg, clr, set);
> >> +		return 0;
> >> +	default:
> >> +		drm_warn(&gt_to_xe(hwmon->gt)->drm, "Invalid xe hwmon reg operation: %d\n",
> >> +			 operation);
> >> +		return -EOPNOTSUPP;
> >> +	}
> >> +}
> >> +
> >> +int xe_hwmon_process_reg_read64(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg, u64 *value)
> >> +{
> >> +	struct xe_reg reg;
> >> +
> >> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
> >> +
> >> +	if (!reg.raw)
> >> +		return -EOPNOTSUPP;
> >> +
> >> +	*value = xe_mmio_read64_2x32(hwmon->gt, reg);
> >> +
> >> +	return 0;
> >
> > We can't make read64 part of enum xe_hwmon_reg_operation?
> read64 takes argument "u64 *value" so kept it separate.

OK, makes sense.

> >
> >
> >> +}
> >> +
> >> +#define PL1_DISABLE 0
> >> +
> >> +/*
> >> + * HW allows arbitrary PL1 limits to be set but silently clamps these values to
> >> + * "typical but not guaranteed" min/max values in REG_PKG_POWER_SKU. Follow the
> >> + * same pattern for sysfs, allow arbitrary PL1 limits to be set but display
> >> + * clamped values when read.
> >> + */
> >> +static int xe_hwmon_power_max_read(struct xe_hwmon *hwmon, long *value)
> >> +{
> >> +	u32 reg_val;
> >> +	u64 reg_val64, min, max;
> >> +
> >> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val, 0, 0);
> >> +	/* Check if PL1 limit is disabled */
> >> +	if (!(reg_val & PKG_PWR_LIM_1_EN)) {
> >> +		*value = PL1_DISABLE;
> >> +		return 0;
> >> +	}
> >> +
> >> +	reg_val = REG_FIELD_GET(PKG_PWR_LIM_1, reg_val);
> >> +	*value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power);
> >> +
> >> +	xe_hwmon_process_reg_read64(hwmon, REG_PKG_POWER_SKU, &reg_val64);
> >> +	min = REG_FIELD_GET(PKG_MIN_PWR, reg_val64);
> >> +	min = mul_u64_u32_shr(min, SF_POWER, hwmon->scl_shift_power);
> >> +	max = REG_FIELD_GET(PKG_MAX_PWR, reg_val64);
> >> +	max = mul_u64_u32_shr(max, SF_POWER, hwmon->scl_shift_power);
> >> +
> >> +	if (min && max)
> >> +		*value = clamp_t(u64, *value, min, max);
> >
> > Not exactly correct. Should be:
> >
> >	if (min)
> >		clamp at min
> >	if (max)
> >		clamp at max
> >
> > I was thinking of changing it for i915 but was lazy.
> Sure, thanks for pointing this.
> >
> >
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
> >> +{
> >> +	u32 reg_val;
> >> +
> >> +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
> >> +	if (value == PL1_DISABLE) {
> >> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
> >> +				     PKG_PWR_LIM_1_EN, 0);
> >> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,
> >
> > If we are not checking for return codes from these functions, why are they
> > not void?
> Top level functions expect return. For function xe_hwmon_power_max_write
> returning error if PL1 disable not possible. The functions
> xe_hwmon_power_max_read/xe_hwmon_power_rated_max_read can be made void,
> then it will look like. What difference its going to make? I feel existing
> approach is much readable.

As I have pointed out in the other mail it is not. It raises more questions
about why the return code is not being checked, whether the function can
return an error. So it is better to be crisp as to what can actually happen.

>
> case hwmon_power_max:
>          xe_hwmon_power_max_read(hwmon, val);
>	 return 0;
> case hwmon_power_rated_max:
>          xe_hwmon_power_rated_max_read(hwmon, val);
>	 return 0;

This is fine.

> >
> > Also, how about separate read/write/rmw functions as Andi was suggesting?
> > They would be clearer I think.
> Would not prefer to add further abstraction, lets keep as is. Going further
> while adding new platforms will think about adding it.

OK, no need to add wrappers.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-27  8:39     ` Nilawar, Badal
@ 2023-09-28  4:55       ` Dixit, Ashutosh
  2023-09-29  6:37         ` Nilawar, Badal
  0 siblings, 1 reply; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-28  4:55 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
>

Hi Badal,

> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
> > On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
> >>
> >> +static umode_t
> >> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> >> +		    u32 attr, int channel)
> >> +{
> >> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> >> +	int ret;
> >> +
> >> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> >
> > Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
> > is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
> > doesn't read/write registers.
> Agreed, but visible function is called only once while registering hwmon
> interface, which happen during driver probe. During driver probe device
> will be in resumed state. So no harm in keeping
> xe_device_mem_access_get/put in visible function.

To me it doesn't make any sense to keep xe_device_mem_access_get/put
anywhere except in xe_hwmon_process_reg where the HW access actually
happens. We can eliminate xe_device_mem_access_get/put's all over the place
if we do it. Isn't it?

The only restriction I have heard of (though not sure why) is that
xe_device_mem_access_get/put should not be called under lock. Though I am
not sure it is for spinlock or also mutex. So as we were saying the locking
will also need to move to xe_hwmon_process_reg.

So:

xe_hwmon_process_reg()
{
	xe_device_mem_access_get
	mutex_lock
	...
	mutex_unlock
	xe_device_mem_access_put
}

So once again if this is not possible for some reason let's figure out why.

> >
> > Also do we need to take forcewake? i915 had forcewake table so it would
> > take forcewake automatically but XE doesn't do that.
> Hwmon regs doesn't fall under GT domain so doesn't need forcewake.

OK, great.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
                     ` (2 preceding siblings ...)
  2023-09-27  4:53   ` Dixit, Ashutosh
@ 2023-09-28  4:55   ` Dixit, Ashutosh
  3 siblings, 0 replies; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-28  4:55 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>

Hi Badal,

Here's how I think this we should change this patch.

> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
> new file mode 100644
> index 000000000000..44d814e111c6
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
> @@ -0,0 +1,357 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include <linux/hwmon.h>
> +
> +#include <drm/drm_managed.h>
> +#include "regs/xe_gt_regs.h"
> +#include "regs/xe_mchbar_regs.h"
> +#include "xe_device.h"
> +#include "xe_gt.h"
> +#include "xe_hwmon.h"
> +#include "xe_mmio.h"
> +
> +enum xe_hwmon_reg {
> +	REG_PKG_RAPL_LIMIT,
> +	REG_PKG_POWER_SKU,
> +	REG_PKG_POWER_SKU_UNIT,
> +};
> +
> +enum xe_hwmon_reg_operation {

enum xe_hwmon_reg_op

> +	REG_READ,
> +	REG_WRITE,
> +	REG_RMW,
> +};
> +
> +/*
> + * SF_* - scale factors for particular quantities according to hwmon spec.
> + */
> +#define SF_POWER	1000000		/* microwatts */
> +
> +struct xe_hwmon {
> +	struct device *hwmon_dev;
> +	struct xe_gt *gt;
> +	struct mutex hwmon_lock; /* rmw operations*/
> +	int scl_shift_power;
> +};
> +
> +static u32 xe_hwmon_get_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg)

Return 'struct xe_reg' from here. Caller does .raw if needed, so caller can
do xe_hwmon_get_reg(...).raw to check when needed.

So basically this function can return a NULL register if say a particular
register does not exist on a platorm.

Also this is the function which should be called from the is_visible()
functions (as has already been done) (so if this function returns NULL the
sysfs entries will not be visible). This allows for other functions to be
void.

> +{
> +	struct xe_device *xe = gt_to_xe(hwmon->gt);
> +	struct xe_reg reg = XE_REG(0);
> +
> +	switch (hwmon_reg) {
> +	case REG_PKG_RAPL_LIMIT:
> +		if (xe->info.platform == XE_DG2)
> +			reg = PCU_CR_PACKAGE_RAPL_LIMIT;
> +		else if (xe->info.platform == XE_PVC)
> +			reg = PVC_GT0_PACKAGE_RAPL_LIMIT;
> +		break;
> +	case REG_PKG_POWER_SKU:
> +		if (xe->info.platform == XE_DG2)
> +			reg = PCU_CR_PACKAGE_POWER_SKU;
> +		else if (xe->info.platform == XE_PVC)
> +			reg = PVC_GT0_PACKAGE_POWER_SKU;
> +		break;
> +	case REG_PKG_POWER_SKU_UNIT:
> +		if (xe->info.platform == XE_DG2)
> +			reg = PCU_CR_PACKAGE_POWER_SKU_UNIT;
> +		else if (xe->info.platform == XE_PVC)
> +			reg = PVC_GT0_PACKAGE_POWER_SKU_UNIT;
> +		break;
> +	default:
> +		drm_warn(&xe->drm, "Unknown xe hwmon reg id: %d\n", hwmon_reg);
> +		break;
> +	}
> +
> +	return reg.raw;
> +}
> +
> +static int xe_hwmon_process_reg(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg,

Should be void. As described above this should never get called if a
particular register does not exist because the sysfs entries will not be
visible.

> +				enum xe_hwmon_reg_operation operation, u32 *value,
> +				u32 clr, u32 set)

I would also make the 'op' the second argument, so it is a little bit
easier to see if we are reading or writing (as I said elsewhere we can skip
adding read/write wrappers).

> +{
> +	struct xe_reg reg;
> +
> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
> +
> +	if (!reg.raw)
> +		return -EOPNOTSUPP;

If register doesn't exist is a WARN_ON.

> +
> +	switch (operation) {
> +	case REG_READ:
> +		*value = xe_mmio_read32(hwmon->gt, reg);
> +		return 0;
> +	case REG_WRITE:
> +		xe_mmio_write32(hwmon->gt, reg, *value);
> +		return 0;
> +	case REG_RMW:
> +		*value = xe_mmio_rmw32(hwmon->gt, reg, clr, set);
> +		return 0;
> +	default:
> +		drm_warn(&gt_to_xe(hwmon->gt)->drm, "Invalid xe hwmon reg operation: %d\n",
> +			 operation);
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +int xe_hwmon_process_reg_read64(struct xe_hwmon *hwmon, enum xe_hwmon_reg hwmon_reg, u64 *value)

Why not just xe_hwmon_reg_read64?

> +{
> +	struct xe_reg reg;
> +
> +	reg.raw = xe_hwmon_get_reg(hwmon, hwmon_reg);
> +
> +	if (!reg.raw)
> +		return -EOPNOTSUPP;
> +
> +	*value = xe_mmio_read64_2x32(hwmon->gt, reg);
> +
> +	return 0;

Again should be void, for the same reason as xe_hwmon_process_reg.

> +}
> +
> +#define PL1_DISABLE 0
> +
> +/*
> + * HW allows arbitrary PL1 limits to be set but silently clamps these values to
> + * "typical but not guaranteed" min/max values in REG_PKG_POWER_SKU. Follow the
> + * same pattern for sysfs, allow arbitrary PL1 limits to be set but display
> + * clamped values when read.
> + */
> +static int xe_hwmon_power_max_read(struct xe_hwmon *hwmon, long *value)
> +{
> +	u32 reg_val;
> +	u64 reg_val64, min, max;
> +
> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val, 0, 0);
> +	/* Check if PL1 limit is disabled */
> +	if (!(reg_val & PKG_PWR_LIM_1_EN)) {
> +		*value = PL1_DISABLE;
> +		return 0;
> +	}
> +
> +	reg_val = REG_FIELD_GET(PKG_PWR_LIM_1, reg_val);
> +	*value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power);
> +
> +	xe_hwmon_process_reg_read64(hwmon, REG_PKG_POWER_SKU, &reg_val64);
> +	min = REG_FIELD_GET(PKG_MIN_PWR, reg_val64);
> +	min = mul_u64_u32_shr(min, SF_POWER, hwmon->scl_shift_power);
> +	max = REG_FIELD_GET(PKG_MAX_PWR, reg_val64);
> +	max = mul_u64_u32_shr(max, SF_POWER, hwmon->scl_shift_power);
> +
> +	if (min && max)
> +		*value = clamp_t(u64, *value, min, max);
> +
> +	return 0;
> +}

Should be void.

> +
> +static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
> +{
> +	u32 reg_val;
> +
> +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
> +	if (value == PL1_DISABLE) {
> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
> +				     PKG_PWR_LIM_1_EN, 0);
> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,
> +				     PKG_PWR_LIM_1_EN, 0);
> +
> +		if (reg_val & PKG_PWR_LIM_1_EN)
> +			return -EOPNOTSUPP;

This function cannot be void since we return an error here, so it's fine.

> +	}
> +
> +	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> +	reg_val = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
> +	reg_val = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, reg_val);
> +
> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
> +			     PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, reg_val);
> +
> +	return 0;
> +}
> +
> +static int xe_hwmon_power_rated_max_read(struct xe_hwmon *hwmon, long *value)
> +{
> +	u32 reg_val;
> +
> +	xe_hwmon_process_reg(hwmon, REG_PKG_POWER_SKU, REG_READ, &reg_val, 0, 0);
> +	reg_val = REG_FIELD_GET(PKG_TDP, reg_val);
> +	*value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power);
> +
> +	return 0;
> +}

Should be void.

> +
> +static const struct hwmon_channel_info *hwmon_info[] = {
> +	HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX),
> +	NULL
> +};
> +
> +static umode_t
> +xe_hwmon_power_is_visible(struct xe_hwmon *hwmon, u32 attr, int chan)
> +{
> +	switch (attr) {
> +	case hwmon_power_max:
> +		return xe_hwmon_get_reg(hwmon, REG_PKG_RAPL_LIMIT) ? 0664 : 0;
> +	case hwmon_power_rated_max:
> +		return xe_hwmon_get_reg(hwmon, REG_PKG_POWER_SKU) ? 0444 : 0;
> +	default:
> +		return 0;
> +	}
> +}

This is fine.

> +
> +static int
> +xe_hwmon_power_read(struct xe_hwmon *hwmon, u32 attr, int chan, long *val)
> +{
> +	switch (attr) {
> +	case hwmon_power_max:
> +		return xe_hwmon_power_max_read(hwmon, val);
> +	case hwmon_power_rated_max:
> +		return xe_hwmon_power_rated_max_read(hwmon, val);
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}

Fine, just take care of void returns.

> +
> +static int
> +xe_hwmon_power_write(struct xe_hwmon *hwmon, u32 attr, int chan, long val)
> +{
> +	switch (attr) {
> +	case hwmon_power_max:
> +		return xe_hwmon_power_max_write(hwmon, val);
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static umode_t
> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> +		    u32 attr, int channel)
> +{
> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> +	int ret;
> +
> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));

Let's move xe_device_mem_access_get() to xe_hwmon_process_reg().

> +
> +	switch (type) {
> +	case hwmon_power:
> +		ret = xe_hwmon_power_is_visible(hwmon, attr, channel);
> +		break;
> +	default:
> +		ret = 0;
> +		break;

return 0;

> +	}
> +
> +	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
> +
> +	return ret;
> +}
> +
> +static int
> +xe_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
> +	      int channel, long *val)
> +{
> +	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
> +	int ret;
> +
> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));

Move xe_device_mem_access_get() to xe_hwmon_process_reg().

> +
> +	switch (type) {
> +	case hwmon_power:
> +		ret = xe_hwmon_power_read(hwmon, attr, channel, val);
> +		break;
> +	default:
> +		ret = -EOPNOTSUPP;
> +		break;

return -EOPNOTSUPP;


> +	}
> +
> +	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
> +
> +	return ret;
> +}
> +
> +static int
> +xe_hwmon_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
> +	       int channel, long val)
> +{
> +	struct xe_hwmon *hwmon = dev_get_drvdata(dev);
> +	int ret;
> +
> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));

Move xe_device_mem_access_get() to xe_hwmon_process_reg().

> +
> +	switch (type) {
> +	case hwmon_power:
> +		ret = xe_hwmon_power_write(hwmon, attr, channel, val);
> +		break;
> +	default:
> +		ret = -EOPNOTSUPP;
> +		break;

return -EOPNOTSUPP;

> +	}
> +
> +	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
> +
> +	return ret;
> +}
> +
> +static const struct hwmon_ops hwmon_ops = {
> +	.is_visible = xe_hwmon_is_visible,
> +	.read = xe_hwmon_read,
> +	.write = xe_hwmon_write,
> +};
> +
> +static const struct hwmon_chip_info hwmon_chip_info = {
> +	.ops = &hwmon_ops,
> +	.info = hwmon_info,
> +};
> +
> +static void
> +xe_hwmon_get_preregistration_info(struct xe_device *xe)
> +{
> +	struct xe_hwmon *hwmon = xe->hwmon;
> +	u32 val_sku_unit = 0;
> +	int ret;
> +
> +	ret = xe_hwmon_process_reg(hwmon, REG_PKG_POWER_SKU_UNIT, REG_READ, &val_sku_unit, 0, 0);
> +	/*
> +	 * The contents of register PKG_POWER_SKU_UNIT do not change,
> +	 * so read it once and store the shift values.
> +	 */
> +	if (!ret)
> +		hwmon->scl_shift_power = REG_FIELD_GET(PKG_PWR_UNIT, val_sku_unit);

if xe_hwmon_is_visible(... hwmon_power ...) {
	xe_hwmon_process_reg();
	hwmon->scl_shift_power = REG_FIELD_GET(PKG_PWR_UNIT, val_sku_unit);
}

Please let me know if any of this is not possible. I will look at the other
patches after you respond, though it looks like they will also need almost
identical changes.

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-28  4:55       ` Dixit, Ashutosh
@ 2023-09-29  6:37         ` Nilawar, Badal
  2023-09-29 16:48           ` Dixit, Ashutosh
  0 siblings, 1 reply; 32+ messages in thread
From: Nilawar, Badal @ 2023-09-29  6:37 UTC (permalink / raw)
  To: Dixit, Ashutosh
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi



On 28-09-2023 10:25, Dixit, Ashutosh wrote:
> On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
>>
> 
> Hi Badal,
> 
>> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
>>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>>>>
>>>> +static umode_t
>>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
>>>> +		    u32 attr, int channel)
>>>> +{
>>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
>>>> +	int ret;
>>>> +
>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
>>>
>>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
>>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
>>> doesn't read/write registers.
>> Agreed, but visible function is called only once while registering hwmon
>> interface, which happen during driver probe. During driver probe device
>> will be in resumed state. So no harm in keeping
>> xe_device_mem_access_get/put in visible function.
> 
> To me it doesn't make any sense to keep xe_device_mem_access_get/put
> anywhere except in xe_hwmon_process_reg where the HW access actually
> happens. We can eliminate xe_device_mem_access_get/put's all over the place
> if we do it. Isn't it?
Agreed, thought process here suggest that take rpm wakeref at lowest 
possible level. I already tried this in rfc series and in some extent in 
rev2. There is problem with this approach. See my comments below.
> 
> The only restriction I have heard of (though not sure why) is that
> xe_device_mem_access_get/put should not be called under lock. Though I am
> not sure it is for spinlock or also mutex. So as we were saying the locking
> will also need to move to xe_hwmon_process_reg.
Yes from rev2 comments its dangerous to take mutex before 
xe_device_mem_access_get/put. With code for "PL1 disable/restore during 
resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> 
rpm resume -> disable pl1 -> mutex lock (dead lock here).
> 
> So:
> 
> xe_hwmon_process_reg()
> {
> 	xe_device_mem_access_get
> 	mutex_lock
> 	...
> 	mutex_unlock
> 	xe_device_mem_access_put
> }
> 
> So once again if this is not possible for some reason let's figure out why.
There are two problems with this approach.

Problem 1: If you see implementation of xe_hwmon_power_max_write, reg 
access is happening 3 times, so there will be 3 rpm suspend/resume 
cycles. I was observing the same with rfc implementation. So in 
subsequent series xe_device_mem_access_put/get is moved to top level 
functions i.e. hwmon hooks.

Problem 2: If locking moved inside xe_hwmon_process_reg then between two 
subsequent reg accesses it will open small window during which race can 
happen.
As Anshuman suggested in other thread for read are sequential and 
protected by sysfs layer. So lets apply locking only for RW attributes.


+static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
+{
+	u32 reg_val;
+
+	/* Disable PL1 limit and verify, as limit cannot be disabled on all 
platforms */
+	if (value == PL1_DISABLE) {
+		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
+				     PKG_PWR_LIM_1_EN, 0);
+		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,
+				     PKG_PWR_LIM_1_EN, 0);
+
+		if (reg_val & PKG_PWR_LIM_1_EN)
+			return -EOPNOTSUPP;
+	}
+
+	/* Computation in 64-bits to avoid overflow. Round to nearest. */
+	reg_val = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, 
SF_POWER);
+	reg_val = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, reg_val);
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
+			     PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, reg_val);
+
+	return 0;
+}

Regards,
Badal
> 
>>>
>>> Also do we need to take forcewake? i915 had forcewake table so it would
>>> take forcewake automatically but XE doesn't do that.
>> Hwmon regs doesn't fall under GT domain so doesn't need forcewake.
> 
> OK, great.
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-29  6:37         ` Nilawar, Badal
@ 2023-09-29 16:48           ` Dixit, Ashutosh
  2023-09-29 21:41             ` Dixit, Ashutosh
  0 siblings, 1 reply; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-29 16:48 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
>

Hi Badal,

> On 28-09-2023 10:25, Dixit, Ashutosh wrote:
> > On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
> >
> >> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
> >>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
> >>>>
> >>>> +static umode_t
> >>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> >>>> +		    u32 attr, int channel)
> >>>> +{
> >>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> >>>> +	int ret;
> >>>> +
> >>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> >>>
> >>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
> >>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
> >>> doesn't read/write registers.
> >> Agreed, but visible function is called only once while registering hwmon
> >> interface, which happen during driver probe. During driver probe device
> >> will be in resumed state. So no harm in keeping
> >> xe_device_mem_access_get/put in visible function.
> >
> > To me it doesn't make any sense to keep xe_device_mem_access_get/put
> > anywhere except in xe_hwmon_process_reg where the HW access actually
> > happens. We can eliminate xe_device_mem_access_get/put's all over the place
> > if we do it. Isn't it?
> Agreed, thought process here suggest that take rpm wakeref at lowest
> possible level. I already tried this in rfc series and in some extent in
> rev2. There is problem with this approach. See my comments below.
> >
> > The only restriction I have heard of (though not sure why) is that
> > xe_device_mem_access_get/put should not be called under lock. Though I am
> > not sure it is for spinlock or also mutex. So as we were saying the locking
> > will also need to move to xe_hwmon_process_reg.
> Yes from rev2 comments its dangerous to take mutex before
> xe_device_mem_access_get/put. With code for "PL1 disable/restore during
> resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
> resume -> disable pl1 -> mutex lock (dead lock here).

But this is already the wrong order as mentioned below. If we follow the
below order do we still see deadlock?

> >
> > So:
> >
> > xe_hwmon_process_reg()
> > {
> >	xe_device_mem_access_get
> >	mutex_lock
> >	...
> >	mutex_unlock
> >	xe_device_mem_access_put
> > }
> >
> > So once again if this is not possible for some reason let's figure out why.
> There are two problems with this approach.
>
> Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
> access is happening 3 times, so there will be 3 rpm suspend/resume
> cycles. I was observing the same with rfc implementation. So in subsequent
> series xe_device_mem_access_put/get is moved to top level functions
> i.e. hwmon hooks.

This is not exactly correct because there is also a 1 second autosuspend
delay which will prevent such rpm suspend/resume cycles:

xe_pm_runtime_init:
	pm_runtime_set_autosuspend_delay(dev, 1000);


>
> Problem 2: If locking moved inside xe_hwmon_process_reg then between two
> subsequent reg accesses it will open small window during which race can
> happen.
> As Anshuman suggested in other thread for read are sequential and protected
> by sysfs layer. So lets apply locking only for RW attributes.

But what is the locking trying to protect? As far as I understand it is
just the registers which have to be atomically modified/read. So it seems
sufficient to just protect the register accesses with the lock.

So I am still not convinced.

Thanks.
--
Ashutosh


>
> +static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
> +{
> +	u32 reg_val;
> +
> +	/* Disable PL1 limit and verify, as limit cannot be disabled on all
> platforms */
> +	if (value == PL1_DISABLE) {
> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
> +				     PKG_PWR_LIM_1_EN, 0);
> +		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,
> +				     PKG_PWR_LIM_1_EN, 0);
> +
> +		if (reg_val & PKG_PWR_LIM_1_EN)
> +			return -EOPNOTSUPP;
> +	}
> +
> +	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> +	reg_val = DIV_ROUND_CLOSEST_ULL((u64)value <<
> hwmon->scl_shift_power, SF_POWER);
> +	reg_val = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, reg_val);
> +
> +	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
> +			     PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, reg_val);
> +
> +	return 0;
> +}
>
> Regards,
> Badal
> >
> >>>
> >>> Also do we need to take forcewake? i915 had forcewake table so it would
> >>> take forcewake automatically but XE doesn't do that.
> >> Hwmon regs doesn't fall under GT domain so doesn't need forcewake.
> >
> > OK, great.
> >
> > Thanks.
> > --
> > Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-29 16:48           ` Dixit, Ashutosh
@ 2023-09-29 21:41             ` Dixit, Ashutosh
  2023-10-04  0:52               ` [Intel-xe] " Dixit, Ashutosh
  2023-10-04 10:18               ` Nilawar, Badal
  0 siblings, 2 replies; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-09-29 21:41 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

On Fri, 29 Sep 2023 09:48:36 -0700, Dixit, Ashutosh wrote:
>

Hi Badal,

> On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
> >
> > On 28-09-2023 10:25, Dixit, Ashutosh wrote:
> > > On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
> > >
> > >> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
> > >>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
> > >>>>
> > >>>> +static umode_t
> > >>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> > >>>> +		    u32 attr, int channel)
> > >>>> +{
> > >>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> > >>>> +	int ret;
> > >>>> +
> > >>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> > >>>
> > >>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
> > >>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
> > >>> doesn't read/write registers.
> > >> Agreed, but visible function is called only once while registering hwmon
> > >> interface, which happen during driver probe. During driver probe device
> > >> will be in resumed state. So no harm in keeping
> > >> xe_device_mem_access_get/put in visible function.
> > >
> > > To me it doesn't make any sense to keep xe_device_mem_access_get/put
> > > anywhere except in xe_hwmon_process_reg where the HW access actually
> > > happens. We can eliminate xe_device_mem_access_get/put's all over the place
> > > if we do it. Isn't it?
> > Agreed, thought process here suggest that take rpm wakeref at lowest
> > possible level. I already tried this in rfc series and in some extent in
> > rev2. There is problem with this approach. See my comments below.
> > >
> > > The only restriction I have heard of (though not sure why) is that
> > > xe_device_mem_access_get/put should not be called under lock>. Though I am
> > > not sure it is for spinlock or also mutex. So as we were saying the locking
> > > will also need to move to xe_hwmon_process_reg.
> > Yes from rev2 comments its dangerous to take mutex before
> > xe_device_mem_access_get/put. With code for "PL1 disable/restore during
> > resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
> > resume -> disable pl1 -> mutex lock (dead lock here).
>
> But this is already the wrong order as mentioned below. If we follow the
> below order do we still see deadlock?
>
> > >
> > > So:
> > >
> > > xe_hwmon_process_reg()
> > > {
> > >	xe_device_mem_access_get
> > >	mutex_lock
> > >	...
> > >	mutex_unlock
> > >	xe_device_mem_access_put
> > > }
> > >
> > > So once again if this is not possible for some reason let's figure out why.
> > There are two problems with this approach.
> >
> > Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
> > access is happening 3 times, so there will be 3 rpm suspend/resume
> > cycles. I was observing the same with rfc implementation. So in subsequent
> > series xe_device_mem_access_put/get is moved to top level functions
> > i.e. hwmon hooks.
>
> This is not exactly correct because there is also a 1 second autosuspend
> delay which will prevent such rpm suspend/resume cycles:
>
> xe_pm_runtime_init:
>	pm_runtime_set_autosuspend_delay(dev, 1000);
>
>
> >
> > Problem 2: If locking moved inside xe_hwmon_process_reg then between two
> > subsequent reg accesses it will open small window during which race can
> > happen.
> > As Anshuman suggested in other thread for read are sequential and protected
> > by sysfs layer. So lets apply locking only for RW attributes.
>
> But what is the locking trying to protect? As far as I understand it is
> just the registers which have to be atomically modified/read. So it seems
> sufficient to just protect the register accesses with the lock.
>
> So I am still not convinced.

Let's figure out the locking first depending on what needs to be protected
(just registers or other data too). And then we can see where to put the
xe_device_mem_access_get/put's (following the rule that
xe_device_mem_access_get/put's should not be called under lock).

Thanks.
--
Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-xe] [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-29 21:41             ` Dixit, Ashutosh
@ 2023-10-04  0:52               ` Dixit, Ashutosh
  2023-10-04  6:43                 ` Nilawar, Badal
  2023-10-04 10:18               ` Nilawar, Badal
  1 sibling, 1 reply; 32+ messages in thread
From: Dixit, Ashutosh @ 2023-10-04  0:52 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: linux-hwmon, linux, rodrigo.vivi, intel-xe, Anshuman Gupta,
	Matthew Brost

On Fri, 29 Sep 2023 14:41:22 -0700, Dixit, Ashutosh wrote:
>

Hi Badal,

Why did you merge the hwmon patches when there is still open discussion
below on the patches? According to upstream rules (I'm not sure if you know
about this) you should not merge patches, even if you have R-b's on the
patches, till all review comments are resolved.

Generally you are expected to either address the comments or reply to the
comments are at least inform that you are merging, disregarding the
comments. IMO you should at least have done one of these before merging.

Cc: @Vivi, Rodrigo

Thanks.
--
Ashutosh


> On Fri, 29 Sep 2023 09:48:36 -0700, Dixit, Ashutosh wrote:
> > On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
> > >
> > > On 28-09-2023 10:25, Dixit, Ashutosh wrote:
> > > > On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
> > > >
> > > >> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
> > > >>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
> > > >>>>
> > > >>>> +static umode_t
> > > >>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> > > >>>> +		    u32 attr, int channel)
> > > >>>> +{
> > > >>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> > > >>>> +	int ret;
> > > >>>> +
> > > >>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> > > >>>
> > > >>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
> > > >>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
> > > >>> doesn't read/write registers.
> > > >> Agreed, but visible function is called only once while registering hwmon
> > > >> interface, which happen during driver probe. During driver probe device
> > > >> will be in resumed state. So no harm in keeping
> > > >> xe_device_mem_access_get/put in visible function.
> > > >
> > > > To me it doesn't make any sense to keep xe_device_mem_access_get/put
> > > > anywhere except in xe_hwmon_process_reg where the HW access actually
> > > > happens. We can eliminate xe_device_mem_access_get/put's all over the place
> > > > if we do it. Isn't it?
> > > Agreed, thought process here suggest that take rpm wakeref at lowest
> > > possible level. I already tried this in rfc series and in some extent in
> > > rev2. There is problem with this approach. See my comments below.
> > > >
> > > > The only restriction I have heard of (though not sure why) is that
> > > > xe_device_mem_access_get/put should not be called under lock>. Though I am
> > > > not sure it is for spinlock or also mutex. So as we were saying the locking
> > > > will also need to move to xe_hwmon_process_reg.
> > > Yes from rev2 comments its dangerous to take mutex before
> > > xe_device_mem_access_get/put. With code for "PL1 disable/restore during
> > > resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
> > > resume -> disable pl1 -> mutex lock (dead lock here).
> >
> > But this is already the wrong order as mentioned below. If we follow the
> > below order do we still see deadlock?
> >
> > > >
> > > > So:
> > > >
> > > > xe_hwmon_process_reg()
> > > > {
> > > >	xe_device_mem_access_get
> > > >	mutex_lock
> > > >	...
> > > >	mutex_unlock
> > > >	xe_device_mem_access_put
> > > > }
> > > >
> > > > So once again if this is not possible for some reason let's figure out why.
> > > There are two problems with this approach.
> > >
> > > Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
> > > access is happening 3 times, so there will be 3 rpm suspend/resume
> > > cycles. I was observing the same with rfc implementation. So in subsequent
> > > series xe_device_mem_access_put/get is moved to top level functions
> > > i.e. hwmon hooks.
> >
> > This is not exactly correct because there is also a 1 second autosuspend
> > delay which will prevent such rpm suspend/resume cycles:
> >
> > xe_pm_runtime_init:
> >	pm_runtime_set_autosuspend_delay(dev, 1000);
> >
> >
> > >
> > > Problem 2: If locking moved inside xe_hwmon_process_reg then between two
> > > subsequent reg accesses it will open small window during which race can
> > > happen.
> > > As Anshuman suggested in other thread for read are sequential and protected
> > > by sysfs layer. So lets apply locking only for RW attributes.
> >
> > But what is the locking trying to protect? As far as I understand it is
> > just the registers which have to be atomically modified/read. So it seems
> > sufficient to just protect the register accesses with the lock.
> >
> > So I am still not convinced.
>
> Let's figure out the locking first depending on what needs to be protected
> (just registers or other data too). And then we can see where to put the
> xe_device_mem_access_get/put's (following the rule that
> xe_device_mem_access_get/put's should not be called under lock).

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-xe] [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-10-04  0:52               ` [Intel-xe] " Dixit, Ashutosh
@ 2023-10-04  6:43                 ` Nilawar, Badal
  2023-10-04 15:56                   ` Rodrigo Vivi
  0 siblings, 1 reply; 32+ messages in thread
From: Nilawar, Badal @ 2023-10-04  6:43 UTC (permalink / raw)
  To: Dixit, Ashutosh
  Cc: linux-hwmon, linux, rodrigo.vivi, intel-xe, Anshuman Gupta,
	Matthew Brost

Hi Anshutosh,

On 04-10-2023 06:22, Dixit, Ashutosh wrote:
> On Fri, 29 Sep 2023 14:41:22 -0700, Dixit, Ashutosh wrote:
>>
> 
> Hi Badal,
> 
> Why did you merge the hwmon patches when there is still open discussion
> below on the patches? According to upstream rules (I'm not sure if you know
> about this) you should not merge patches, even if you have R-b's on the
> patches, till all review comments are resolved.
> 
> Generally you are expected to either address the comments or reply to the
> comments are at least inform that you are merging, disregarding the > comments. IMO you should at least have done one of these before merging.

I did selective merging. I haven't merged 5th patch yet as locking is 
still in discussion. I am working on addressing locking and thought I 
will address some of your comments with it.

Thanks,
Badal
> 
> Cc: @Vivi, Rodrigo
> 
> Thanks.
> --
> Ashutosh
> 
> 
>> On Fri, 29 Sep 2023 09:48:36 -0700, Dixit, Ashutosh wrote:
>>> On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
>>>>
>>>> On 28-09-2023 10:25, Dixit, Ashutosh wrote:
>>>>> On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
>>>>>
>>>>>> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
>>>>>>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>>>>>>>>
>>>>>>>> +static umode_t
>>>>>>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
>>>>>>>> +		    u32 attr, int channel)
>>>>>>>> +{
>>>>>>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
>>>>>>>> +	int ret;
>>>>>>>> +
>>>>>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
>>>>>>>
>>>>>>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
>>>>>>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
>>>>>>> doesn't read/write registers.
>>>>>> Agreed, but visible function is called only once while registering hwmon
>>>>>> interface, which happen during driver probe. During driver probe device
>>>>>> will be in resumed state. So no harm in keeping
>>>>>> xe_device_mem_access_get/put in visible function.
>>>>>
>>>>> To me it doesn't make any sense to keep xe_device_mem_access_get/put
>>>>> anywhere except in xe_hwmon_process_reg where the HW access actually
>>>>> happens. We can eliminate xe_device_mem_access_get/put's all over the place
>>>>> if we do it. Isn't it?
>>>> Agreed, thought process here suggest that take rpm wakeref at lowest
>>>> possible level. I already tried this in rfc series and in some extent in
>>>> rev2. There is problem with this approach. See my comments below.
>>>>>
>>>>> The only restriction I have heard of (though not sure why) is that
>>>>> xe_device_mem_access_get/put should not be called under lock>. Though I am
>>>>> not sure it is for spinlock or also mutex. So as we were saying the locking
>>>>> will also need to move to xe_hwmon_process_reg.
>>>> Yes from rev2 comments its dangerous to take mutex before
>>>> xe_device_mem_access_get/put. With code for "PL1 disable/restore during
>>>> resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
>>>> resume -> disable pl1 -> mutex lock (dead lock here).
>>>
>>> But this is already the wrong order as mentioned below. If we follow the
>>> below order do we still see deadlock?
>>>
>>>>>
>>>>> So:
>>>>>
>>>>> xe_hwmon_process_reg()
>>>>> {
>>>>> 	xe_device_mem_access_get
>>>>> 	mutex_lock
>>>>> 	...
>>>>> 	mutex_unlock
>>>>> 	xe_device_mem_access_put
>>>>> }
>>>>>
>>>>> So once again if this is not possible for some reason let's figure out why.
>>>> There are two problems with this approach.
>>>>
>>>> Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
>>>> access is happening 3 times, so there will be 3 rpm suspend/resume
>>>> cycles. I was observing the same with rfc implementation. So in subsequent
>>>> series xe_device_mem_access_put/get is moved to top level functions
>>>> i.e. hwmon hooks.
>>>
>>> This is not exactly correct because there is also a 1 second autosuspend
>>> delay which will prevent such rpm suspend/resume cycles:
>>>
>>> xe_pm_runtime_init:
>>> 	pm_runtime_set_autosuspend_delay(dev, 1000);
>>>
>>>
>>>>
>>>> Problem 2: If locking moved inside xe_hwmon_process_reg then between two
>>>> subsequent reg accesses it will open small window during which race can
>>>> happen.
>>>> As Anshuman suggested in other thread for read are sequential and protected
>>>> by sysfs layer. So lets apply locking only for RW attributes.
>>>
>>> But what is the locking trying to protect? As far as I understand it is
>>> just the registers which have to be atomically modified/read. So it seems
>>> sufficient to just protect the register accesses with the lock.
>>>
>>> So I am still not convinced.
>>
>> Let's figure out the locking first depending on what needs to be protected
>> (just registers or other data too). And then we can see where to put the
>> xe_device_mem_access_get/put's (following the rule that
>> xe_device_mem_access_get/put's should not be called under lock).

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-09-29 21:41             ` Dixit, Ashutosh
  2023-10-04  0:52               ` [Intel-xe] " Dixit, Ashutosh
@ 2023-10-04 10:18               ` Nilawar, Badal
  1 sibling, 0 replies; 32+ messages in thread
From: Nilawar, Badal @ 2023-10-04 10:18 UTC (permalink / raw)
  To: Dixit, Ashutosh
  Cc: intel-xe, linux-hwmon, anshuman.gupta, linux, andi.shyti,
	riana.tauro, matthew.brost, rodrigo.vivi

Hi Ashutosh,

On 30-09-2023 03:11, Dixit, Ashutosh wrote:
> On Fri, 29 Sep 2023 09:48:36 -0700, Dixit, Ashutosh wrote:
>>
> 
> Hi Badal,
> 
>> On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
>>>
>>> On 28-09-2023 10:25, Dixit, Ashutosh wrote:
>>>> On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
>>>>
>>>>> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
>>>>>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>>>>>>>
>>>>>>> +static umode_t
>>>>>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
>>>>>>> +		    u32 attr, int channel)
>>>>>>> +{
>>>>>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
>>>>>>> +	int ret;
>>>>>>> +
>>>>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
>>>>>>
>>>>>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
>>>>>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
>>>>>> doesn't read/write registers.
>>>>> Agreed, but visible function is called only once while registering hwmon
>>>>> interface, which happen during driver probe. During driver probe device
>>>>> will be in resumed state. So no harm in keeping
>>>>> xe_device_mem_access_get/put in visible function.
>>>>
>>>> To me it doesn't make any sense to keep xe_device_mem_access_get/put
>>>> anywhere except in xe_hwmon_process_reg where the HW access actually
>>>> happens. We can eliminate xe_device_mem_access_get/put's all over the place
>>>> if we do it. Isn't it?
>>> Agreed, thought process here suggest that take rpm wakeref at lowest
>>> possible level. I already tried this in rfc series and in some extent in
>>> rev2. There is problem with this approach. See my comments below.
>>>>
>>>> The only restriction I have heard of (though not sure why) is that
>>>> xe_device_mem_access_get/put should not be called under lock>. Though I am
>>>> not sure it is for spinlock or also mutex. So as we were saying the locking
>>>> will also need to move to xe_hwmon_process_reg.
>>> Yes from rev2 comments its dangerous to take mutex before
>>> xe_device_mem_access_get/put. With code for "PL1 disable/restore during
>>> resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
>>> resume -> disable pl1 -> mutex lock (dead lock here).
>>
>> But this is already the wrong order as mentioned below. If we follow the
>> below order do we still see deadlock?
>>
>>>>
>>>> So:
>>>>
>>>> xe_hwmon_process_reg()
>>>> {
>>>> 	xe_device_mem_access_get
>>>> 	mutex_lock
>>>> 	...
>>>> 	mutex_unlock
>>>> 	xe_device_mem_access_put
>>>> }
>>>>
>>>> So once again if this is not possible for some reason let's figure out why.
>>> There are two problems with this approach.
>>>
>>> Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
>>> access is happening 3 times, so there will be 3 rpm suspend/resume
>>> cycles. I was observing the same with rfc implementation. So in subsequent
>>> series xe_device_mem_access_put/get is moved to top level functions
>>> i.e. hwmon hooks.
>>
>> This is not exactly correct because there is also a 1 second autosuspend
>> delay which will prevent such rpm suspend/resume cycles:
>>
>> xe_pm_runtime_init:
>> 	pm_runtime_set_autosuspend_delay(dev, 1000);
>>
rpm auto suspend delay can be 0 as well, IGT does set it to 0. In that 
case there will be rpm cycle for every register access. So it better to 
keep xe_device_mem_access_get/put at attribute level i.e. in hwmon hooks.
>>
>>>
>>> Problem 2: If locking moved inside xe_hwmon_process_reg then between two
>>> subsequent reg accesses it will open small window during which race can
>>> happen.
>>> As Anshuman suggested in other thread for read are sequential and protected
>>> by sysfs layer. So lets apply locking only for RW attributes.
>>
>> But what is the locking trying to protect? As far as I understand it is
>> just the registers which have to be atomically modified/read. So it seems
>> sufficient to just protect the register accesses with the lock.
>>
>> So I am still not convinced.
In i915 initially rmw accesses were protected with lock but later with 
addition of PL1 disable (during resume) logic in some extent locking got 
extended to attribute level.

Another scenario where locking is needed is for rw attributes where 
write and read can happen from different threads.

For readonly attributes as per this 
(https://elixir.bootlin.com/linux/latest/source/fs/seq_file.c) locking 
is not needed.

I think lets apply locking at attribute level.
> 
> Let's figure out the locking first depending on what needs to be protected
> (just registers or other data too). And then we can see where to put the
> xe_device_mem_access_get/put's (following the rule that
> xe_device_mem_access_get/put's should not be called under lock).


Regards,
Badal
> 
> Thanks.
> --
> Ashutosh

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-xe] [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-10-04  6:43                 ` Nilawar, Badal
@ 2023-10-04 15:56                   ` Rodrigo Vivi
  2023-10-04 16:11                     ` Rodrigo Vivi
  0 siblings, 1 reply; 32+ messages in thread
From: Rodrigo Vivi @ 2023-10-04 15:56 UTC (permalink / raw)
  To: Nilawar, Badal
  Cc: Dixit, Ashutosh, linux-hwmon, linux, intel-xe, Anshuman Gupta,
	Matthew Brost

On Wed, Oct 04, 2023 at 12:13:06PM +0530, Nilawar, Badal wrote:
> Hi Anshutosh,
> 
> On 04-10-2023 06:22, Dixit, Ashutosh wrote:
> > On Fri, 29 Sep 2023 14:41:22 -0700, Dixit, Ashutosh wrote:
> > > 
> > 
> > Hi Badal,
> > 
> > Why did you merge the hwmon patches when there is still open discussion
> > below on the patches? According to upstream rules (I'm not sure if you know
> > about this) you should not merge patches, even if you have R-b's on the
> > patches, till all review comments are resolved.
> > 
> > Generally you are expected to either address the comments or reply to the
> > comments are at least inform that you are merging, disregarding the > comments. IMO you should at least have done one of these before merging.
> 
> I did selective merging. I haven't merged 5th patch yet as locking is still
> in discussion. I am working on addressing locking and thought I will address
> some of your comments with it.

Just to ensure the split is clear to everyone and that we have CI running on
the exact chunk that is getting merged, next time, please split the series,
rebase and resend the ones that are ready. you might even use --subject-prefix=CI

and as always, let's not rush things in and be sure that all questions
and concerns are addressed.

Thanks,
Rodrigo.

> 
> Thanks,
> Badal
> > 
> > Cc: @Vivi, Rodrigo
> > 
> > Thanks.
> > --
> > Ashutosh
> > 
> > 
> > > On Fri, 29 Sep 2023 09:48:36 -0700, Dixit, Ashutosh wrote:
> > > > On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
> > > > > 
> > > > > On 28-09-2023 10:25, Dixit, Ashutosh wrote:
> > > > > > On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
> > > > > > 
> > > > > > > On 27-09-2023 10:23, Dixit, Ashutosh wrote:
> > > > > > > > On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
> > > > > > > > > 
> > > > > > > > > +static umode_t
> > > > > > > > > +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> > > > > > > > > +		    u32 attr, int channel)
> > > > > > > > > +{
> > > > > > > > > +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> > > > > > > > > +	int ret;
> > > > > > > > > +
> > > > > > > > > +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> > > > > > > > 
> > > > > > > > Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
> > > > > > > > is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
> > > > > > > > doesn't read/write registers.
> > > > > > > Agreed, but visible function is called only once while registering hwmon
> > > > > > > interface, which happen during driver probe. During driver probe device
> > > > > > > will be in resumed state. So no harm in keeping
> > > > > > > xe_device_mem_access_get/put in visible function.
> > > > > > 
> > > > > > To me it doesn't make any sense to keep xe_device_mem_access_get/put
> > > > > > anywhere except in xe_hwmon_process_reg where the HW access actually
> > > > > > happens. We can eliminate xe_device_mem_access_get/put's all over the place
> > > > > > if we do it. Isn't it?
> > > > > Agreed, thought process here suggest that take rpm wakeref at lowest
> > > > > possible level. I already tried this in rfc series and in some extent in
> > > > > rev2. There is problem with this approach. See my comments below.
> > > > > > 
> > > > > > The only restriction I have heard of (though not sure why) is that
> > > > > > xe_device_mem_access_get/put should not be called under lock>. Though I am
> > > > > > not sure it is for spinlock or also mutex. So as we were saying the locking
> > > > > > will also need to move to xe_hwmon_process_reg.
> > > > > Yes from rev2 comments its dangerous to take mutex before
> > > > > xe_device_mem_access_get/put. With code for "PL1 disable/restore during
> > > > > resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
> > > > > resume -> disable pl1 -> mutex lock (dead lock here).
> > > > 
> > > > But this is already the wrong order as mentioned below. If we follow the
> > > > below order do we still see deadlock?
> > > > 
> > > > > > 
> > > > > > So:
> > > > > > 
> > > > > > xe_hwmon_process_reg()
> > > > > > {
> > > > > > 	xe_device_mem_access_get
> > > > > > 	mutex_lock
> > > > > > 	...
> > > > > > 	mutex_unlock
> > > > > > 	xe_device_mem_access_put
> > > > > > }
> > > > > > 
> > > > > > So once again if this is not possible for some reason let's figure out why.
> > > > > There are two problems with this approach.
> > > > > 
> > > > > Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
> > > > > access is happening 3 times, so there will be 3 rpm suspend/resume
> > > > > cycles. I was observing the same with rfc implementation. So in subsequent
> > > > > series xe_device_mem_access_put/get is moved to top level functions
> > > > > i.e. hwmon hooks.
> > > > 
> > > > This is not exactly correct because there is also a 1 second autosuspend
> > > > delay which will prevent such rpm suspend/resume cycles:
> > > > 
> > > > xe_pm_runtime_init:
> > > > 	pm_runtime_set_autosuspend_delay(dev, 1000);
> > > > 
> > > > 
> > > > > 
> > > > > Problem 2: If locking moved inside xe_hwmon_process_reg then between two
> > > > > subsequent reg accesses it will open small window during which race can
> > > > > happen.
> > > > > As Anshuman suggested in other thread for read are sequential and protected
> > > > > by sysfs layer. So lets apply locking only for RW attributes.
> > > > 
> > > > But what is the locking trying to protect? As far as I understand it is
> > > > just the registers which have to be atomically modified/read. So it seems
> > > > sufficient to just protect the register accesses with the lock.
> > > > 
> > > > So I am still not convinced.
> > > 
> > > Let's figure out the locking first depending on what needs to be protected
> > > (just registers or other data too). And then we can see where to put the
> > > xe_device_mem_access_get/put's (following the rule that
> > > xe_device_mem_access_get/put's should not be called under lock).

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Intel-xe] [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes
  2023-10-04 15:56                   ` Rodrigo Vivi
@ 2023-10-04 16:11                     ` Rodrigo Vivi
  0 siblings, 0 replies; 32+ messages in thread
From: Rodrigo Vivi @ 2023-10-04 16:11 UTC (permalink / raw)
  To: Nilawar, Badal; +Cc: linux-hwmon, intel-xe, linux

On Wed, Oct 04, 2023 at 11:56:33AM -0400, Rodrigo Vivi wrote:
> On Wed, Oct 04, 2023 at 12:13:06PM +0530, Nilawar, Badal wrote:
> > Hi Anshutosh,
> > 
> > On 04-10-2023 06:22, Dixit, Ashutosh wrote:
> > > On Fri, 29 Sep 2023 14:41:22 -0700, Dixit, Ashutosh wrote:
> > > > 
> > > 
> > > Hi Badal,
> > > 
> > > Why did you merge the hwmon patches when there is still open discussion
> > > below on the patches? According to upstream rules (I'm not sure if you know
> > > about this) you should not merge patches, even if you have R-b's on the
> > > patches, till all review comments are resolved.
> > > 
> > > Generally you are expected to either address the comments or reply to the
> > > comments are at least inform that you are merging, disregarding the > comments. IMO you should at least have done one of these before merging.
> > 
> > I did selective merging. I haven't merged 5th patch yet as locking is still
> > in discussion. I am working on addressing locking and thought I will address
> > some of your comments with it.

There was still an open discussion going around the (merged) patch 1, regarding
the positioning of the the mem_access get/put. So, next time hold a bit before
pushing. But the positioning of those mem_access get/put are safe although maybe
not ideal... (needed?!). Anyway that can be a follow up fix or improvement and
I'm okay with the way it currently is in the code.

> 
> Just to ensure the split is clear to everyone and that we have CI running on
> the exact chunk that is getting merged, next time, please split the series,
> rebase and resend the ones that are ready. you might even use --subject-prefix=CI
> 
> and as always, let's not rush things in and be sure that all questions
> and concerns are addressed.
> 
> Thanks,
> Rodrigo.
> 
> > 
> > Thanks,
> > Badal
> > > 
> > > Cc: @Vivi, Rodrigo
> > > 
> > > Thanks.
> > > --
> > > Ashutosh
> > > 
> > > 
> > > > On Fri, 29 Sep 2023 09:48:36 -0700, Dixit, Ashutosh wrote:
> > > > > On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
> > > > > > 
> > > > > > On 28-09-2023 10:25, Dixit, Ashutosh wrote:
> > > > > > > On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
> > > > > > > 
> > > > > > > > On 27-09-2023 10:23, Dixit, Ashutosh wrote:
> > > > > > > > > On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
> > > > > > > > > > 
> > > > > > > > > > +static umode_t
> > > > > > > > > > +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> > > > > > > > > > +		    u32 attr, int channel)
> > > > > > > > > > +{
> > > > > > > > > > +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
> > > > > > > > > > +	int ret;
> > > > > > > > > > +
> > > > > > > > > > +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
> > > > > > > > > 
> > > > > > > > > Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
> > > > > > > > > is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
> > > > > > > > > doesn't read/write registers.
> > > > > > > > Agreed, but visible function is called only once while registering hwmon
> > > > > > > > interface, which happen during driver probe. During driver probe device
> > > > > > > > will be in resumed state. So no harm in keeping
> > > > > > > > xe_device_mem_access_get/put in visible function.
> > > > > > > 
> > > > > > > To me it doesn't make any sense to keep xe_device_mem_access_get/put
> > > > > > > anywhere except in xe_hwmon_process_reg where the HW access actually
> > > > > > > happens. We can eliminate xe_device_mem_access_get/put's all over the place
> > > > > > > if we do it. Isn't it?
> > > > > > Agreed, thought process here suggest that take rpm wakeref at lowest
> > > > > > possible level. I already tried this in rfc series and in some extent in
> > > > > > rev2. There is problem with this approach. See my comments below.
> > > > > > > 
> > > > > > > The only restriction I have heard of (though not sure why) is that
> > > > > > > xe_device_mem_access_get/put should not be called under lock>. Though I am
> > > > > > > not sure it is for spinlock or also mutex. So as we were saying the locking
> > > > > > > will also need to move to xe_hwmon_process_reg.
> > > > > > Yes from rev2 comments its dangerous to take mutex before
> > > > > > xe_device_mem_access_get/put. With code for "PL1 disable/restore during
> > > > > > resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
> > > > > > resume -> disable pl1 -> mutex lock (dead lock here).
> > > > > 
> > > > > But this is already the wrong order as mentioned below. If we follow the
> > > > > below order do we still see deadlock?
> > > > > 
> > > > > > > 
> > > > > > > So:
> > > > > > > 
> > > > > > > xe_hwmon_process_reg()
> > > > > > > {
> > > > > > > 	xe_device_mem_access_get
> > > > > > > 	mutex_lock
> > > > > > > 	...
> > > > > > > 	mutex_unlock
> > > > > > > 	xe_device_mem_access_put
> > > > > > > }
> > > > > > > 
> > > > > > > So once again if this is not possible for some reason let's figure out why.
> > > > > > There are two problems with this approach.
> > > > > > 
> > > > > > Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
> > > > > > access is happening 3 times, so there will be 3 rpm suspend/resume
> > > > > > cycles. I was observing the same with rfc implementation. So in subsequent
> > > > > > series xe_device_mem_access_put/get is moved to top level functions
> > > > > > i.e. hwmon hooks.
> > > > > 
> > > > > This is not exactly correct because there is also a 1 second autosuspend
> > > > > delay which will prevent such rpm suspend/resume cycles:
> > > > > 
> > > > > xe_pm_runtime_init:
> > > > > 	pm_runtime_set_autosuspend_delay(dev, 1000);
> > > > > 
> > > > > 
> > > > > > 
> > > > > > Problem 2: If locking moved inside xe_hwmon_process_reg then between two
> > > > > > subsequent reg accesses it will open small window during which race can
> > > > > > happen.
> > > > > > As Anshuman suggested in other thread for read are sequential and protected
> > > > > > by sysfs layer. So lets apply locking only for RW attributes.
> > > > > 
> > > > > But what is the locking trying to protect? As far as I understand it is
> > > > > just the registers which have to be atomically modified/read. So it seems
> > > > > sufficient to just protect the register accesses with the lock.
> > > > > 
> > > > > So I am still not convinced.
> > > > 
> > > > Let's figure out the locking first depending on what needs to be protected
> > > > (just registers or other data too). And then we can see where to put the
> > > > xe_device_mem_access_get/put's (following the rule that
> > > > xe_device_mem_access_get/put's should not be called under lock).

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2023-10-04 16:12 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-25  8:18 [PATCH v6 0/5] Add HWMON support for DGFX Badal Nilawar
2023-09-25  8:18 ` [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes Badal Nilawar
2023-09-25  8:58   ` Andi Shyti
2023-09-27  4:45   ` Dixit, Ashutosh
2023-09-27 10:28     ` Nilawar, Badal
2023-09-28  4:54       ` Dixit, Ashutosh
2023-09-27  4:53   ` Dixit, Ashutosh
2023-09-27  8:39     ` Nilawar, Badal
2023-09-28  4:55       ` Dixit, Ashutosh
2023-09-29  6:37         ` Nilawar, Badal
2023-09-29 16:48           ` Dixit, Ashutosh
2023-09-29 21:41             ` Dixit, Ashutosh
2023-10-04  0:52               ` [Intel-xe] " Dixit, Ashutosh
2023-10-04  6:43                 ` Nilawar, Badal
2023-10-04 15:56                   ` Rodrigo Vivi
2023-10-04 16:11                     ` Rodrigo Vivi
2023-10-04 10:18               ` Nilawar, Badal
2023-09-28  4:55   ` Dixit, Ashutosh
2023-09-25  8:18 ` [PATCH v6 2/5] drm/xe/hwmon: Expose card reactive critical power Badal Nilawar
2023-09-25  9:03   ` Andi Shyti
2023-09-25  8:18 ` [PATCH v6 3/5] drm/xe/hwmon: Expose input voltage attribute Badal Nilawar
2023-09-25  9:04   ` Andi Shyti
2023-09-25  8:18 ` [PATCH v6 4/5] drm/xe/hwmon: Expose hwmon energy attribute Badal Nilawar
2023-09-25 11:49   ` Andi Shyti
2023-09-25  8:18 ` [PATCH v6 5/5] drm/xe/hwmon: Expose power1_max_interval Badal Nilawar
2023-09-25 11:56   ` Andi Shyti
     [not found]     ` <e5801f36-2f9a-6d24-7af2-1e7174f2e0b4@intel.com>
2023-09-26  8:01       ` Andi Shyti
2023-09-26  9:00         ` Nilawar, Badal
2023-09-26 21:01           ` Andi Shyti
2023-09-27  3:32             ` Dixit, Ashutosh
2023-09-27  9:04               ` Nilawar, Badal
2023-09-27  9:31                 ` Gupta, Anshuman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).