public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 0/3] Intel uncore driver ELC support
@ 2024-08-28 15:34 Tero Kristo
  2024-08-28 15:34 ` [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation Tero Kristo
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Tero Kristo @ 2024-08-28 15:34 UTC (permalink / raw)
  To: ilpo.jarvinen, hdegoede, srinivas.pandruvada
  Cc: platform-driver-x86, linux-kernel

Hi,

Updated based on comments received for v1. No functional changes.
  * Updated documentation (patch #1)
  * Converted one long sequence of if (...)'s to a switch () (patch #2)

-Tero



* [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation
  2024-08-28 15:34 [PATCH v2 0/3] Intel uncore driver ELC support Tero Kristo
@ 2024-08-28 15:34 ` Tero Kristo
  2024-08-29  9:18   ` Ilpo Järvinen
  2024-08-28 15:34 ` [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control Tero Kristo
  2024-08-28 15:34 ` [PATCH v2 3/3] platform/x86/intel-uncore-freq: Add efficiency latency control to sysfs interface Tero Kristo
  2 siblings, 1 reply; 12+ messages in thread
From: Tero Kristo @ 2024-08-28 15:34 UTC (permalink / raw)
  To: ilpo.jarvinen, hdegoede, srinivas.pandruvada
  Cc: platform-driver-x86, linux-kernel

Add documentation for the efficiency vs. latency tradeoff control in
Intel Xeon processors, and describe how it is configured via sysfs.

Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
---
v2:
  * Largely re-wrote the documentation

 .../pm/intel_uncore_frequency_scaling.rst     | 59 +++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
index 5ab3440e6cee..26ded32b06f5 100644
--- a/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
+++ b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
@@ -113,3 +113,62 @@ to apply at each uncore* level.
 
 Support for "current_freq_khz" is available only at each fabric cluster
 level (i.e., in uncore* directory).
+
+Efficiency vs. Latency Tradeoff
+-------------------------------
+
+The Efficiency Latency Control (ELC) feature improves performance
+per watt. With this feature, hardware power management algorithms
+optimize the trade-off between latency and power consumption. For some
+latency-sensitive workloads, further tuning can be done by software to
+get the desired performance.
+
+The hardware monitors the average CPU utilization across all cores
+in a power domain at regular intervals and decides on an uncore frequency.
+While this may result in the best performance per watt, a workload may
+expect higher performance at the expense of power. Consider an
+application that intermittently wakes up to perform memory reads on an
+otherwise idle system. In such cases, if the hardware lowers the uncore
+frequency, there may be a delay in ramping the frequency back up to meet
+the target performance.
+
+ELC defines some parameters which can be changed from software.
+If the average CPU utilization is below a user-defined threshold
+(elc_low_threshold_percent attribute below), the user-defined uncore
+floor frequency will be used (elc_floor_freq_khz attribute
+below) instead of the hardware-calculated minimum.
+
+Similarly, in a high load scenario where the CPU utilization goes above
+the high threshold value (elc_high_threshold_percent attribute below),
+the frequency is increased in 100MHz steps instead of jumping to the
+maximum uncore frequency. This avoids consuming unnecessarily high power
+immediately when CPU utilization spikes.
+
+Attributes for efficiency latency control:
+
+``elc_floor_freq_khz``
+	This attribute is used to get/set the efficiency latency floor frequency.
+	If this value is lower than 'min_freq_khz', it is ignored by
+	the firmware.
+
+``elc_low_threshold_percent``
+	This attribute is used to get/set the efficiency latency control low
+	threshold. The value is a percentage of CPU utilization.
+
+``elc_high_threshold_percent``
+	This attribute is used to get/set the efficiency latency control high
+	threshold. The value is a percentage of CPU utilization.
+
+``elc_high_threshold_enable``
+	This attribute is used to enable/disable the efficiency latency control
+	high threshold. Write '1' to enable, '0' to disable.
+
+The example system configuration below does the following:
+  * when CPU utilization is less than 10%: sets uncore frequency to 800MHz
+  * when CPU utilization is higher than 95%: increases uncore frequency in
+    100MHz steps, until power limit is reached
+
+  elc_floor_freq_khz:800000
+  elc_high_threshold_percent:95
+  elc_high_threshold_enable:1
+  elc_low_threshold_percent:10
-- 
2.43.1
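
Editorial sketch, not part of the posted patch: the example configuration
at the end of the documentation could be applied from userspace by writing
each value to the matching sysfs attribute. The helper names
elc_write_attr() and elc_apply_example() are hypothetical, and the
directory passed in is an assumption (for instance an uncore* directory
under /sys/devices/system/cpu/intel_uncore_frequency/ on hardware that
exposes the elc_* files).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one decimal value to <dir>/<attr>.
 * Returns 0 on success and -1 on any failure.
 */
static int elc_write_attr(const char *dir, const char *attr, unsigned int value)
{
	char path[512];
	FILE *f;
	int n;

	assert(dir && attr);

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fprintf(f, "%u\n", value) < 0) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes; a rejected write may only be reported here. */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Apply the values from the documentation example:
 * 800 MHz floor below 10% utilization, stepped increase above 95%.
 */
static int elc_apply_example(const char *dir)
{
	if (elc_write_attr(dir, "elc_floor_freq_khz", 800000))
		return -1;
	if (elc_write_attr(dir, "elc_low_threshold_percent", 10))
		return -1;
	if (elc_write_attr(dir, "elc_high_threshold_percent", 95))
		return -1;
	/* Enabling last is a choice made in this sketch, not a documented rule. */
	return elc_write_attr(dir, "elc_high_threshold_enable", 1);
}
```

Calling elc_apply_example() with the chosen uncore* directory would need
sufficient privileges; the values read back may differ slightly from the
ones written because the driver converts percentages to a 7-bit field.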



* [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control
  2024-08-28 15:34 [PATCH v2 0/3] Intel uncore driver ELC support Tero Kristo
  2024-08-28 15:34 ` [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation Tero Kristo
@ 2024-08-28 15:34 ` Tero Kristo
  2024-08-29  9:14   ` Ilpo Järvinen
  2024-08-28 15:34 ` [PATCH v2 3/3] platform/x86/intel-uncore-freq: Add efficiency latency control to sysfs interface Tero Kristo
  2 siblings, 1 reply; 12+ messages in thread
From: Tero Kristo @ 2024-08-28 15:34 UTC (permalink / raw)
  To: ilpo.jarvinen, hdegoede, srinivas.pandruvada
  Cc: platform-driver-x86, linux-kernel

Add efficiency latency control support to the TPMI uncore driver. This
defines two new threshold values for controlling uncore frequency: a low
threshold and a high threshold. When CPU utilization is below the low
threshold, the user-configurable floor frequency can be used by the
system. When CPU utilization is above the high threshold, the uncore
frequency is increased in 100MHz steps until the power limit is reached.

Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
---
v2:
  * Converted a long sequence of if (...)'s to a switch

 .../uncore-frequency-common.h                 |   4 +
 .../uncore-frequency/uncore-frequency-tpmi.c  | 158 +++++++++++++++++-
 2 files changed, 160 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
index 4c245b945e4e..b5c7311bfa05 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
@@ -70,6 +70,10 @@ enum uncore_index {
 	UNCORE_INDEX_MIN_FREQ,
 	UNCORE_INDEX_MAX_FREQ,
 	UNCORE_INDEX_CURRENT_FREQ,
+	UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD,
+	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD,
+	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE,
+	UNCORE_INDEX_EFF_LAT_CTRL_FREQ,
 };
 
 int uncore_freq_common_init(int (*read)(struct uncore_data *data, unsigned int *value,
diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
index 9fa3037c03d1..50b28b4b1fc0 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
@@ -30,6 +30,7 @@
 
 #define	UNCORE_MAJOR_VERSION		0
 #define	UNCORE_MINOR_VERSION		2
+#define UNCORE_ELC_SUPPORTED_VERSION	2
 #define UNCORE_HEADER_INDEX		0
 #define UNCORE_FABRIC_CLUSTER_OFFSET	8
 
@@ -46,6 +47,7 @@ struct tpmi_uncore_struct;
 /* Information for each cluster */
 struct tpmi_uncore_cluster_info {
 	bool root_domain;
+	bool elc_supported;
 	u8 __iomem *cluster_base;
 	struct uncore_data uncore_data;
 	struct tpmi_uncore_struct *uncore_root;
@@ -75,6 +77,10 @@ struct tpmi_uncore_struct {
 /* Bit definitions for CONTROL register */
 #define UNCORE_MAX_RATIO_MASK				GENMASK_ULL(14, 8)
 #define UNCORE_MIN_RATIO_MASK				GENMASK_ULL(21, 15)
+#define UNCORE_EFF_LAT_CTRL_RATIO_MASK			GENMASK_ULL(28, 22)
+#define UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK		GENMASK_ULL(38, 32)
+#define UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE	BIT(39)
+#define UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK		GENMASK_ULL(46, 40)
 
 /* Helper function to read MMIO offset for max/min control frequency */
 static void read_control_freq(struct tpmi_uncore_cluster_info *cluster_info,
@@ -89,6 +95,48 @@ static void read_control_freq(struct tpmi_uncore_cluster_info *cluster_info,
 		*value = FIELD_GET(UNCORE_MIN_RATIO_MASK, control) * UNCORE_FREQ_KHZ_MULTIPLIER;
 }
 
+/* Helper function to read efficiency latency control values over MMIO */
+static int read_eff_lat_ctrl(struct uncore_data *data, unsigned int *val, enum uncore_index index)
+{
+	struct tpmi_uncore_cluster_info *cluster_info;
+	u64 ctrl;
+
+	cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data);
+	if (cluster_info->root_domain)
+		return -ENODATA;
+
+	if (!cluster_info->elc_supported)
+		return -EOPNOTSUPP;
+
+	ctrl = readq(cluster_info->cluster_base + UNCORE_CONTROL_INDEX);
+
+	switch (index) {
+	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
+		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK, ctrl);
+		*val *= 100;
+		*val = DIV_ROUND_UP(*val, FIELD_MAX(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK));
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
+		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK, ctrl);
+		*val *= 100;
+		*val = DIV_ROUND_UP(*val, FIELD_MAX(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK));
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
+		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE, ctrl);
+		break;
+	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
+		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_RATIO_MASK, ctrl) * UNCORE_FREQ_KHZ_MULTIPLIER;
+		break;
+
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
 #define UNCORE_MAX_RATIO	FIELD_MAX(UNCORE_MAX_RATIO_MASK)
 
 /* Helper for sysfs read for max/min frequencies. Called under mutex locks */
@@ -137,6 +185,82 @@ static int uncore_read_control_freq(struct uncore_data *data, unsigned int *valu
 	return 0;
 }
 
+/* Helper function for writing efficiency latency control values over MMIO */
+static int write_eff_lat_ctrl(struct uncore_data *data, unsigned int val, enum uncore_index index)
+{
+	struct tpmi_uncore_cluster_info *cluster_info;
+	u64 control;
+
+	cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data);
+
+	if (cluster_info->root_domain)
+		return -ENODATA;
+
+	if (!cluster_info->elc_supported)
+		return -EOPNOTSUPP;
+
+	switch (index) {
+	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
+		if (val > 100)
+			return -EINVAL;
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
+		if (val > 100)
+			return -EINVAL;
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
+		if (val > 1)
+			return -EINVAL;
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
+		val /= UNCORE_FREQ_KHZ_MULTIPLIER;
+		if (val > FIELD_MAX(UNCORE_EFF_LAT_CTRL_RATIO_MASK))
+			return -EINVAL;
+		break;
+
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	control = readq(cluster_info->cluster_base + UNCORE_CONTROL_INDEX);
+
+	switch (index) {
+	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
+		val *= FIELD_MAX(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK);
+		val /= 100;
+		control &= ~UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK;
+		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK, val);
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
+		val *= FIELD_MAX(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK);
+		val /= 100;
+		control &= ~UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK;
+		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK, val);
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
+		control &= ~UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE;
+		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE, val);
+		break;
+
+	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
+		control &= ~UNCORE_EFF_LAT_CTRL_RATIO_MASK;
+		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_RATIO_MASK, val);
+		break;
+
+	default:
+		break;
+	}
+
+	writeq(control, cluster_info->cluster_base + UNCORE_CONTROL_INDEX);
+
+	return 0;
+}
+
 /* Helper function to write MMIO offset for max/min control frequency */
 static void write_control_freq(struct tpmi_uncore_cluster_info *cluster_info, unsigned int input,
 			      unsigned int index)
@@ -156,7 +280,7 @@ static void write_control_freq(struct tpmi_uncore_cluster_info *cluster_info, un
 	writeq(control, (cluster_info->cluster_base + UNCORE_CONTROL_INDEX));
 }
 
-/* Callback for sysfs write for max/min frequencies. Called under mutex locks */
+/* Helper for sysfs write for max/min frequencies. Called under mutex locks */
 static int uncore_write_control_freq(struct uncore_data *data, unsigned int input,
 				     enum uncore_index index)
 {
@@ -234,6 +358,33 @@ static int uncore_read(struct uncore_data *data, unsigned int *value, enum uncor
 	case UNCORE_INDEX_CURRENT_FREQ:
 		return uncore_read_freq(data, value);
 
+	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
+	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
+		return read_eff_lat_ctrl(data, value, index);
+
+	default:
+		break;
+	}
+
+	return -EOPNOTSUPP;
+}
+
+/* Callback for sysfs write for TPMI uncore data. Called under mutex locks. */
+static int uncore_write(struct uncore_data *data, unsigned int value, enum uncore_index index)
+{
+	switch (index) {
+	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
+	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
+	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
+		return write_eff_lat_ctrl(data, value, index);
+
+	case UNCORE_INDEX_MIN_FREQ:
+	case UNCORE_INDEX_MAX_FREQ:
+		return uncore_write_control_freq(data, value, index);
+
 	default:
 		break;
 	}
@@ -291,7 +442,7 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
 		return -EINVAL;
 
 	/* Register callbacks to uncore core */
-	ret = uncore_freq_common_init(uncore_read, uncore_write_control_freq);
+	ret = uncore_freq_common_init(uncore_read, uncore_write);
 	if (ret)
 		return ret;
 
@@ -409,6 +560,9 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
 
 			cluster_info->uncore_root = tpmi_uncore;
 
+			if (TPMI_MINOR_VERSION(pd_info->ufs_header_ver) >= UNCORE_ELC_SUPPORTED_VERSION)
+				cluster_info->elc_supported = true;
+
 			ret = uncore_freq_add_entry(&cluster_info->uncore_data, 0);
 			if (ret) {
 				cluster_info->cluster_base = NULL;
-- 
2.43.1
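
Editorial sketch, not part of the posted patch: the new CONTROL register
fields sit at bits 28:22 (floor ratio), 38:32 (low threshold), 39 (high
threshold enable) and 46:40 (high threshold). The userspace code below
mirrors that layout with its own macros so the packing can be checked
outside the kernel. All ELC_* names and the elc_* helpers are
hypothetical stand-ins for GENMASK_ULL(), FIELD_GET() and FIELD_PREP(),
and ELC_KHZ_PER_RATIO = 100000 is an assumption about the value of
UNCORE_FREQ_KHZ_MULTIPLIER, which is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous 64-bit mask covering bits hi..lo, like GENMASK_ULL(hi, lo). */
#define ELC_MASK(hi, lo) \
	(((~UINT64_C(0)) >> (63 - (hi))) & ((~UINT64_C(0)) << (lo)))

#define ELC_RATIO_MASK			ELC_MASK(28, 22)
#define ELC_LOW_THRESHOLD_MASK		ELC_MASK(38, 32)
#define ELC_HIGH_THRESHOLD_ENABLE	(UINT64_C(1) << 39)
#define ELC_HIGH_THRESHOLD_MASK		ELC_MASK(46, 40)

/* Assumed kHz per ratio unit (100 MHz); not taken from the patch. */
#define ELC_KHZ_PER_RATIO		100000u

/* Position of the lowest set bit of a non-zero mask. */
static unsigned int elc_shift(uint64_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Extract a field from the register value. */
static uint64_t elc_field_get(uint64_t mask, uint64_t reg)
{
	return (reg & mask) >> elc_shift(mask);
}

/* Replace one field, leaving every other bit of the register untouched. */
static uint64_t elc_field_set(uint64_t mask, uint64_t reg, uint64_t val)
{
	unsigned int shift = elc_shift(mask);

	assert(val <= (mask >> shift));
	return (reg & ~mask) | (val << shift);
}
```

With these helpers, a floor of 800000 kHz becomes ratio 8 in bits 28:22,
and clearing or setting one field does not disturb the max/min ratio
fields that share the same register.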



* [PATCH v2 3/3] platform/x86/intel-uncore-freq: Add efficiency latency control to sysfs interface
  2024-08-28 15:34 [PATCH v2 0/3] Intel uncore driver ELC support Tero Kristo
  2024-08-28 15:34 ` [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation Tero Kristo
  2024-08-28 15:34 ` [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control Tero Kristo
@ 2024-08-28 15:34 ` Tero Kristo
  2 siblings, 0 replies; 12+ messages in thread
From: Tero Kristo @ 2024-08-28 15:34 UTC (permalink / raw)
  To: ilpo.jarvinen, hdegoede, srinivas.pandruvada
  Cc: platform-driver-x86, linux-kernel

Add the TPMI efficiency latency control fields to the sysfs interface.
The sysfs files are mapped to the TPMI uncore driver via the registered
uncore_read and uncore_write driver callbacks. These fields are not
populated on older non-TPMI hardware.

Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
v2:
  * Added Ilpo's reviewed by tag

 .../uncore-frequency-common.c                 | 42 ++++++++++++++++---
 .../uncore-frequency-common.h                 | 13 +++++-
 2 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c
index 4e880585cbe4..e22b683a7a43 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c
@@ -60,11 +60,16 @@ static ssize_t show_attr(struct uncore_data *data, char *buf, enum uncore_index
 static ssize_t store_attr(struct uncore_data *data, const char *buf, ssize_t count,
 			  enum uncore_index index)
 {
-	unsigned int input;
+	unsigned int input = 0;
 	int ret;
 
-	if (kstrtouint(buf, 10, &input))
-		return -EINVAL;
+	if (index == UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE) {
+		if (kstrtobool(buf, (bool *)&input))
+			return -EINVAL;
+	} else {
+		if (kstrtouint(buf, 10, &input))
+			return -EINVAL;
+	}
 
 	mutex_lock(&uncore_lock);
 	ret = uncore_write(data, input, index);
@@ -103,6 +108,18 @@ show_uncore_attr(max_freq_khz, UNCORE_INDEX_MAX_FREQ);
 
 show_uncore_attr(current_freq_khz, UNCORE_INDEX_CURRENT_FREQ);
 
+store_uncore_attr(elc_low_threshold_percent, UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD);
+store_uncore_attr(elc_high_threshold_percent, UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD);
+store_uncore_attr(elc_high_threshold_enable,
+		  UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE);
+store_uncore_attr(elc_floor_freq_khz, UNCORE_INDEX_EFF_LAT_CTRL_FREQ);
+
+show_uncore_attr(elc_low_threshold_percent, UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD);
+show_uncore_attr(elc_high_threshold_percent, UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD);
+show_uncore_attr(elc_high_threshold_enable,
+		 UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE);
+show_uncore_attr(elc_floor_freq_khz, UNCORE_INDEX_EFF_LAT_CTRL_FREQ);
+
 #define show_uncore_data(member_name)					\
 	static ssize_t show_##member_name(struct kobject *kobj,	\
 					   struct kobj_attribute *attr, char *buf)\
@@ -146,7 +163,8 @@ show_uncore_data(initial_max_freq_khz);
 
 static int create_attr_group(struct uncore_data *data, char *name)
 {
-	int ret, freq, index = 0;
+	int ret, index = 0;
+	unsigned int val;
 
 	init_attribute_rw(max_freq_khz);
 	init_attribute_rw(min_freq_khz);
@@ -168,10 +186,24 @@ static int create_attr_group(struct uncore_data *data, char *name)
 	data->uncore_attrs[index++] = &data->initial_min_freq_khz_kobj_attr.attr;
 	data->uncore_attrs[index++] = &data->initial_max_freq_khz_kobj_attr.attr;
 
-	ret = uncore_read(data, &freq, UNCORE_INDEX_CURRENT_FREQ);
+	ret = uncore_read(data, &val, UNCORE_INDEX_CURRENT_FREQ);
 	if (!ret)
 		data->uncore_attrs[index++] = &data->current_freq_khz_kobj_attr.attr;
 
+	ret = uncore_read(data, &val, UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD);
+	if (!ret) {
+		init_attribute_rw(elc_low_threshold_percent);
+		init_attribute_rw(elc_high_threshold_percent);
+		init_attribute_rw(elc_high_threshold_enable);
+		init_attribute_rw(elc_floor_freq_khz);
+
+		data->uncore_attrs[index++] = &data->elc_low_threshold_percent_kobj_attr.attr;
+		data->uncore_attrs[index++] = &data->elc_high_threshold_percent_kobj_attr.attr;
+		data->uncore_attrs[index++] =
+			&data->elc_high_threshold_enable_kobj_attr.attr;
+		data->uncore_attrs[index++] = &data->elc_floor_freq_khz_kobj_attr.attr;
+	}
+
 	data->uncore_attrs[index] = NULL;
 
 	data->uncore_attr_group.name = name;
diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
index b5c7311bfa05..26c854cd5d97 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
@@ -34,6 +34,13 @@
  * @domain_id_kobj_attr: Storage for kobject attribute domain_id
  * @fabric_cluster_id_kobj_attr: Storage for kobject attribute fabric_cluster_id
  * @package_id_kobj_attr: Storage for kobject attribute package_id
+ * @elc_low_threshold_percent_kobj_attr:
+		Storage for kobject attribute elc_low_threshold_percent
+ * @elc_high_threshold_percent_kobj_attr:
+		Storage for kobject attribute elc_high_threshold_percent
+ * @elc_high_threshold_enable_kobj_attr:
+		Storage for kobject attribute elc_high_threshold_enable
+ * @elc_floor_freq_khz_kobj_attr: Storage for kobject attribute elc_floor_freq_khz
  * @uncore_attrs:	Attribute storage for group creation
  *
  * This structure is used to encapsulate all data related to uncore sysfs
@@ -61,7 +68,11 @@ struct uncore_data {
 	struct kobj_attribute domain_id_kobj_attr;
 	struct kobj_attribute fabric_cluster_id_kobj_attr;
 	struct kobj_attribute package_id_kobj_attr;
-	struct attribute *uncore_attrs[9];
+	struct kobj_attribute elc_low_threshold_percent_kobj_attr;
+	struct kobj_attribute elc_high_threshold_percent_kobj_attr;
+	struct kobj_attribute elc_high_threshold_enable_kobj_attr;
+	struct kobj_attribute elc_floor_freq_khz_kobj_attr;
+	struct attribute *uncore_attrs[13];
 };
 
 #define UNCORE_DOMAIN_ID_INVALID	-1
-- 
2.43.1
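
Editorial sketch, not part of the posted patch: store_attr() parses the
enable attribute with kstrtobool() through a (bool *) cast of the
unsigned int input. One alternative shape is to parse into a temporary
bool and assign it afterwards. The code below is a simplified userspace
analogue of that idea; elc_parse_bool(), elc_parse_uint() and
elc_parse_input() are hypothetical names, and the accepted spellings
only approximate what kstrtobool() and kstrtouint() take.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Accept y/n, t/f, 1/0 and on/off, judged by the leading characters. */
static int elc_parse_bool(const char *s, bool *res)
{
	assert(s && res);

	switch (s[0]) {
	case 'y': case 'Y': case 't': case 'T': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case 'f': case 'F': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		if (s[1] == 'n' || s[1] == 'N') {
			*res = true;
			return 0;
		}
		if (s[1] == 'f' || s[1] == 'F') {
			*res = false;
			return 0;
		}
		break;
	default:
		break;
	}
	return -EINVAL;
}

/* Parse a base-10 unsigned int, allowing a single trailing newline. */
static int elc_parse_uint(const char *s, unsigned int *res)
{
	unsigned long v;
	char *end;

	assert(s && res);

	if (s[0] < '0' || s[0] > '9')
		return -EINVAL;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno || v > UINT_MAX)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*res = (unsigned int)v;
	return 0;
}

/* Parse either kind of attribute input without casting pointer types. */
static int elc_parse_input(const char *buf, bool is_enable_attr, unsigned int *input)
{
	bool flag;
	int ret;

	if (!is_enable_attr)
		return elc_parse_uint(buf, input);

	ret = elc_parse_bool(buf, &flag);
	if (ret)
		return ret;

	*input = flag;	/* converts to exactly 0 or 1 */
	return 0;
}
```

The temporary keeps the result independent of how a bool is laid out
inside an unsigned int; whether that matters for this x86-only driver is
a judgment for the patch author and reviewers.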



* Re: [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control
  2024-08-28 15:34 ` [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control Tero Kristo
@ 2024-08-29  9:14   ` Ilpo Järvinen
  2024-08-30  7:21     ` Tero Kristo
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2024-08-29  9:14 UTC (permalink / raw)
  To: Tero Kristo; +Cc: Hans de Goede, srinivas.pandruvada, platform-driver-x86, LKML


On Wed, 28 Aug 2024, Tero Kristo wrote:

> Add efficiency latency control support to the TPMI uncore driver. This
> defines two new threshold values for controlling uncore frequency, low
> threshold and high threshold. When CPU utilization is below low threshold,
> the user configurable floor latency control frequency can be used by the
> system. When CPU utilization is above high threshold, the uncore frequency
> is increased in 100MHz steps until power limit is reached.
> 
> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> ---
> v2:
>   * Converted a long sequence of if (...)'s to a switch
> 
>  .../uncore-frequency-common.h                 |   4 +
>  .../uncore-frequency/uncore-frequency-tpmi.c  | 158 +++++++++++++++++-
>  2 files changed, 160 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
> index 4c245b945e4e..b5c7311bfa05 100644
> --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
> +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h
> @@ -70,6 +70,10 @@ enum uncore_index {
>  	UNCORE_INDEX_MIN_FREQ,
>  	UNCORE_INDEX_MAX_FREQ,
>  	UNCORE_INDEX_CURRENT_FREQ,
> +	UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD,
> +	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD,
> +	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE,
> +	UNCORE_INDEX_EFF_LAT_CTRL_FREQ,
>  };
>  
>  int uncore_freq_common_init(int (*read)(struct uncore_data *data, unsigned int *value,
> diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> index 9fa3037c03d1..50b28b4b1fc0 100644
> --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> @@ -30,6 +30,7 @@
>  
>  #define	UNCORE_MAJOR_VERSION		0
>  #define	UNCORE_MINOR_VERSION		2
> +#define UNCORE_ELC_SUPPORTED_VERSION	2
>  #define UNCORE_HEADER_INDEX		0
>  #define UNCORE_FABRIC_CLUSTER_OFFSET	8
>  
> @@ -46,6 +47,7 @@ struct tpmi_uncore_struct;
>  /* Information for each cluster */
>  struct tpmi_uncore_cluster_info {
>  	bool root_domain;
> +	bool elc_supported;
>  	u8 __iomem *cluster_base;
>  	struct uncore_data uncore_data;
>  	struct tpmi_uncore_struct *uncore_root;
> @@ -75,6 +77,10 @@ struct tpmi_uncore_struct {
>  /* Bit definitions for CONTROL register */
>  #define UNCORE_MAX_RATIO_MASK				GENMASK_ULL(14, 8)
>  #define UNCORE_MIN_RATIO_MASK				GENMASK_ULL(21, 15)
> +#define UNCORE_EFF_LAT_CTRL_RATIO_MASK			GENMASK_ULL(28, 22)
> +#define UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK		GENMASK_ULL(38, 32)
> +#define UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE	BIT(39)
> +#define UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK		GENMASK_ULL(46, 40)
>  
>  /* Helper function to read MMIO offset for max/min control frequency */
>  static void read_control_freq(struct tpmi_uncore_cluster_info *cluster_info,
> @@ -89,6 +95,48 @@ static void read_control_freq(struct tpmi_uncore_cluster_info *cluster_info,
>  		*value = FIELD_GET(UNCORE_MIN_RATIO_MASK, control) * UNCORE_FREQ_KHZ_MULTIPLIER;
>  }
>  
> +/* Helper function to read efficiency latency control values over MMIO */
> +static int read_eff_lat_ctrl(struct uncore_data *data, unsigned int *val, enum uncore_index index)
> +{
> +	struct tpmi_uncore_cluster_info *cluster_info;
> +	u64 ctrl;
> +
> +	cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data);
> +	if (cluster_info->root_domain)
> +		return -ENODATA;
> +
> +	if (!cluster_info->elc_supported)
> +		return -EOPNOTSUPP;
> +
> +	ctrl = readq(cluster_info->cluster_base + UNCORE_CONTROL_INDEX);
> +
> +	switch (index) {
> +	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
> +		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK, ctrl);
> +		*val *= 100;
> +		*val = DIV_ROUND_UP(*val, FIELD_MAX(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK));
> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
> +		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK, ctrl);
> +		*val *= 100;
> +		*val = DIV_ROUND_UP(*val, FIELD_MAX(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK));

I wonder if DIV_ROUND_CLOSEST() would be more appropriate in these two 
cases, rounding up isn't well justified as I think this wants to round it 
back to the original number to deal with the minor divergences due to 
precision loss during conversions?

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

-- 
 i.

> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
> +		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE, ctrl);
> +		break;
> +	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
> +		*val = FIELD_GET(UNCORE_EFF_LAT_CTRL_RATIO_MASK, ctrl) * UNCORE_FREQ_KHZ_MULTIPLIER;
> +		break;
> +
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}
> +
>  #define UNCORE_MAX_RATIO	FIELD_MAX(UNCORE_MAX_RATIO_MASK)
>  
>  /* Helper for sysfs read for max/min frequencies. Called under mutex locks */
> @@ -137,6 +185,82 @@ static int uncore_read_control_freq(struct uncore_data *data, unsigned int *valu
>  	return 0;
>  }
>  
> +/* Helper function for writing efficiency latency control values over MMIO */
> +static int write_eff_lat_ctrl(struct uncore_data *data, unsigned int val, enum uncore_index index)
> +{
> +	struct tpmi_uncore_cluster_info *cluster_info;
> +	u64 control;
> +
> +	cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data);
> +
> +	if (cluster_info->root_domain)
> +		return -ENODATA;
> +
> +	if (!cluster_info->elc_supported)
> +		return -EOPNOTSUPP;
> +
> +	switch (index) {
> +	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
> +		if (val > 100)
> +			return -EINVAL;
> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
> +		if (val > 100)
> +			return -EINVAL;
> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
> +		if (val > 1)
> +			return -EINVAL;
> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
> +		val /= UNCORE_FREQ_KHZ_MULTIPLIER;
> +		if (val > FIELD_MAX(UNCORE_EFF_LAT_CTRL_RATIO_MASK))
> +			return -EINVAL;
> +		break;
> +
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	control = readq(cluster_info->cluster_base + UNCORE_CONTROL_INDEX);
> +
> +	switch (index) {
> +	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
> +		val *= FIELD_MAX(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK);
> +		val /= 100;
> +		control &= ~UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK;
> +		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK, val);
> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
> +		val *= FIELD_MAX(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK);
> +		val /= 100;
> +		control &= ~UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK;
> +		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK, val);
> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
> +		control &= ~UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE;
> +		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE, val);
> +		break;
> +
> +	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
> +		control &= ~UNCORE_EFF_LAT_CTRL_RATIO_MASK;
> +		control |= FIELD_PREP(UNCORE_EFF_LAT_CTRL_RATIO_MASK, val);
> +		break;
> +
> +	default:
> +		break;
> +	}
> +
> +	writeq(control, cluster_info->cluster_base + UNCORE_CONTROL_INDEX);
> +
> +	return 0;
> +}
> +
>  /* Helper function to write MMIO offset for max/min control frequency */
>  static void write_control_freq(struct tpmi_uncore_cluster_info *cluster_info, unsigned int input,
>  			      unsigned int index)
> @@ -156,7 +280,7 @@ static void write_control_freq(struct tpmi_uncore_cluster_info *cluster_info, un
>  	writeq(control, (cluster_info->cluster_base + UNCORE_CONTROL_INDEX));
>  }
>  
> -/* Callback for sysfs write for max/min frequencies. Called under mutex locks */
> +/* Helper for sysfs write for max/min frequencies. Called under mutex locks */
>  static int uncore_write_control_freq(struct uncore_data *data, unsigned int input,
>  				     enum uncore_index index)
>  {
> @@ -234,6 +358,33 @@ static int uncore_read(struct uncore_data *data, unsigned int *value, enum uncor
>  	case UNCORE_INDEX_CURRENT_FREQ:
>  		return uncore_read_freq(data, value);
>  
> +	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
> +	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
> +		return read_eff_lat_ctrl(data, value, index);
> +
> +	default:
> +		break;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +/* Callback for sysfs write for TPMI uncore data. Called under mutex locks. */
> +static int uncore_write(struct uncore_data *data, unsigned int value, enum uncore_index index)
> +{
> +	switch (index) {
> +	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
> +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE:
> +	case UNCORE_INDEX_EFF_LAT_CTRL_FREQ:
> +		return write_eff_lat_ctrl(data, value, index);
> +
> +	case UNCORE_INDEX_MIN_FREQ:
> +	case UNCORE_INDEX_MAX_FREQ:
> +		return uncore_write_control_freq(data, value, index);
> +
>  	default:
>  		break;
>  	}
> @@ -291,7 +442,7 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
>  		return -EINVAL;
>  
>  	/* Register callbacks to uncore core */
> -	ret = uncore_freq_common_init(uncore_read, uncore_write_control_freq);
> +	ret = uncore_freq_common_init(uncore_read, uncore_write);
>  	if (ret)
>  		return ret;
>  
> @@ -409,6 +560,9 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
>  
>  			cluster_info->uncore_root = tpmi_uncore;
>  
> +			if (TPMI_MINOR_VERSION(pd_info->ufs_header_ver) >= UNCORE_ELC_SUPPORTED_VERSION)
> +				cluster_info->elc_supported = true;
> +
>  			ret = uncore_freq_add_entry(&cluster_info->uncore_data, 0);
>  			if (ret) {
>  				cluster_info->cluster_base = NULL;
> 
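
Editorial sketch, not part of the thread: the rounding question above can
be checked with plain integer arithmetic. The code assumes the write path
truncates (percent * 127 / 100, as in write_eff_lat_ctrl() in the quoted
patch) and compares two read-back conversions that follow the usual
definitions of round-up and round-to-nearest division. The elc_* names
are hypothetical, and this does not settle which macro the driver should
use.

```c
#include <assert.h>

/* Largest value of a 7-bit threshold field. */
#define ELC_THRESHOLD_FIELD_MAX	127u

/* Percent to raw field value, truncating like the quoted write path. */
static unsigned int elc_percent_to_raw(unsigned int percent)
{
	assert(percent <= 100);
	return percent * ELC_THRESHOLD_FIELD_MAX / 100;
}

/* Raw field value to percent, rounding up. */
static unsigned int elc_raw_to_percent_up(unsigned int raw)
{
	assert(raw <= ELC_THRESHOLD_FIELD_MAX);
	return (raw * 100 + ELC_THRESHOLD_FIELD_MAX - 1) / ELC_THRESHOLD_FIELD_MAX;
}

/* Raw field value to percent, rounding to the nearest integer. */
static unsigned int elc_raw_to_percent_closest(unsigned int raw)
{
	assert(raw <= ELC_THRESHOLD_FIELD_MAX);
	return (raw * 100 + ELC_THRESHOLD_FIELD_MAX / 2) / ELC_THRESHOLD_FIELD_MAX;
}

/* Count the percentages in 0..100 that do not read back as written. */
static unsigned int elc_count_mismatches(unsigned int (*to_percent)(unsigned int))
{
	unsigned int percent, mismatches = 0;

	for (percent = 0; percent <= 100; percent++)
		if (to_percent(elc_percent_to_raw(percent)) != percent)
			mismatches++;

	return mismatches;
}
```

Under these assumptions, 10% is stored as raw 12 and 95% as raw 120.
Rounding up reads them back as 10 and 95, while rounding to nearest gives
9 and 94, because the truncation on write always errs downward and
rounding up on read compensates for it.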


* Re: [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation
  2024-08-28 15:34 ` [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation Tero Kristo
@ 2024-08-29  9:18   ` Ilpo Järvinen
  2024-08-29 11:39     ` Tero Kristo
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2024-08-29  9:18 UTC (permalink / raw)
  To: Tero Kristo; +Cc: Hans de Goede, srinivas.pandruvada, platform-driver-x86, LKML


On Wed, 28 Aug 2024, Tero Kristo wrote:

> Added documentation about the functionality of efficiency vs. latency tradeoff
> control in intel Xeon processors, and how this is configured via sysfs.
> 
> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> ---
> v2:
>   * Largely re-wrote the documentation
> 
>  .../pm/intel_uncore_frequency_scaling.rst     | 59 +++++++++++++++++++
>  1 file changed, 59 insertions(+)
> 
> diff --git a/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
> index 5ab3440e6cee..26ded32b06f5 100644
> --- a/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
> +++ b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
> @@ -113,3 +113,62 @@ to apply at each uncore* level.
>  
>  Support for "current_freq_khz" is available only at each fabric cluster
>  level (i.e., in uncore* directory).
> +
> +Efficiency vs. Latency Tradeoff
> +-------------------------------
> +
> +The Efficiency Latency Control (ELC) feature improves performance
> +per watt. With this feature hardware power management algorithms
> +optimize trade-off between latency and power consumption. For some
> +latency sensitive workloads further tuning can be done by SW to
> +get desired performance.
> +
> +The hardware monitors the average CPU utilization across all cores
> +in a power domain at regular intervals and decides an uncore frequency.
> +While this may result in the best performance per watt, workload may be
> +expecting higher performance at the expense of power. Consider an
> +application that intermittently wakes up to perform memory reads on an
> +otherwise idle system. In such cases, if hardware lowers uncore
> +frequency, then there may be delay in ramp up of frequency to meet
> +target performance.
> +
> +The ELC control defines some parameters which can be changed from SW.
> +If the average CPU utilization is below a user defined threshold
> +(elc_low_threshold_percent attribute below), the user defined uncore
> +frequency floor frequency will be used (elc_floor_freq_khz attribute

Consider the following simplification:

"the user defined uncore frequency floor frequency" ->
"the user-defined uncore floor frequency"

I think it tells the same even without that first "frequency".

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

-- 
 i.

> +below) instead of hardware calculated minimum.
> +
> +Similarly in high load scenario where the CPU utilization goes above
> +the high threshold value (elc_high_threshold_percent attribute below)
> +instead of jumping to maximum uncore frequency, frequency is increased
> +in 100MHz steps. This avoids consuming unnecessarily high power
> +immediately with CPU utilization spikes.
> +
> +Attributes for efficiency latency control:
> +
> +``elc_floor_freq_khz``
> +	This attribute is used to get/set the efficiency latency floor frequency.
> +	If this variable is lower than the 'min_freq_khz', it is ignored by
> +	the firmware.
> +
> +``elc_low_threshold_percent``
> +	This attribute is used to get/set the efficiency latency control low
> +	threshold. This attribute is in percentages of CPU utilization.
> +
> +``elc_high_threshold_percent``
> +	This attribute is used to get/set the efficiency latency control high
> +	threshold. This attribute is in percentages of CPU utilization.
> +
> +``elc_high_threshold_enable``
> +	This attribute is used to enable/disable the efficiency latency control
> +	high threshold. Write '1' to enable, '0' to disable.
> +
> +Example system configuration below, which does following:
> +  * when CPU utilization is less than 10%: sets uncore frequency to 800MHz
> +  * when CPU utilization is higher than 95%: increases uncore frequency in
> +    100MHz steps, until power limit is reached
> +
> +  elc_floor_freq_khz:800000
> +  elc_high_threshold_percent:95
> +  elc_high_threshold_enable:1
> +  elc_low_threshold_percent:10
> 


* Re: [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation
  2024-08-29  9:18   ` Ilpo Järvinen
@ 2024-08-29 11:39     ` Tero Kristo
  2024-08-30  7:23       ` Tero Kristo
  0 siblings, 1 reply; 12+ messages in thread
From: Tero Kristo @ 2024-08-29 11:39 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Hans de Goede, srinivas.pandruvada, platform-driver-x86, LKML

On Thu, 2024-08-29 at 12:18 +0300, Ilpo Järvinen wrote:
> On Wed, 28 Aug 2024, Tero Kristo wrote:
> 
> > Added documentation about the functionality of efficiency vs.
> > latency tradeoff
> > control in intel Xeon processors, and how this is configured via
> > sysfs.
> > 
> > Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> > ---
> > v2:
> >   * Largely re-wrote the documentation
> > 
> >  .../pm/intel_uncore_frequency_scaling.rst     | 59
> > +++++++++++++++++++
> >  1 file changed, 59 insertions(+)
> > 
> > diff --git a/Documentation/admin-
> > guide/pm/intel_uncore_frequency_scaling.rst b/Documentation/admin-
> > guide/pm/intel_uncore_frequency_scaling.rst
> > index 5ab3440e6cee..26ded32b06f5 100644
> > --- a/Documentation/admin-
> > guide/pm/intel_uncore_frequency_scaling.rst
> > +++ b/Documentation/admin-
> > guide/pm/intel_uncore_frequency_scaling.rst
> > @@ -113,3 +113,62 @@ to apply at each uncore* level.
> >  
> >  Support for "current_freq_khz" is available only at each fabric
> > cluster
> >  level (i.e., in uncore* directory).
> > +
> > +Efficiency vs. Latency Tradeoff
> > +-------------------------------
> > +
> > +The Efficiency Latency Control (ELC) feature improves performance
> > +per watt. With this feature hardware power management algorithms
> > +optimize trade-off between latency and power consumption. For some
> > +latency sensitive workloads further tuning can be done by SW to
> > +get desired performance.
> > +
> > +The hardware monitors the average CPU utilization across all cores
> > +in a power domain at regular intervals and decides an uncore
> > frequency.
> > +While this may result in the best performance per watt, workload
> > may be
> > +expecting higher performance at the expense of power. Consider an
> > +application that intermittently wakes up to perform memory reads
> > on an
> > +otherwise idle system. In such cases, if hardware lowers uncore
> > +frequency, then there may be delay in ramp up of frequency to meet
> > +target performance.
> > +
> > +The ELC control defines some parameters which can be changed from
> > SW.
> > +If the average CPU utilization is below a user defined threshold
> > +(elc_low_threshold_percent attribute below), the user defined
> > uncore
> > +frequency floor frequency will be used (elc_floor_freq_khz
> > attribute
> 
> Consider the following simplification:
> 
> "the user defined uncore frequency floor frequency" ->
> "the user-defined uncore floor frequency"
> 
> I think it tells the same even without that first "frequency".
> 
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> 

Yeah, it looks kind of silly. I think that's just a typo from my side,
thanks for catching.

-Tero


* Re: [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control
  2024-08-29  9:14   ` Ilpo Järvinen
@ 2024-08-30  7:21     ` Tero Kristo
  2024-08-30 10:09       ` Ilpo Järvinen
  0 siblings, 1 reply; 12+ messages in thread
From: Tero Kristo @ 2024-08-30  7:21 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Hans de Goede, srinivas.pandruvada, platform-driver-x86, LKML

On Thu, 2024-08-29 at 12:14 +0300, Ilpo Järvinen wrote:
> On Wed, 28 Aug 2024, Tero Kristo wrote:
> 
> > Add efficiency latency control support to the TPMI uncore driver.
> > This
> > defines two new threshold values for controlling uncore frequency,
> > low
> > threshold and high threshold. When CPU utilization is below low
> > threshold,
> > the user configurable floor latency control frequency can be used
> > by the
> > system. When CPU utilization is above high threshold, the uncore
> > frequency
> > is increased in 100MHz steps until power limit is reached.
> > 
> > Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> > ---
> > v2:
> >   * Converted a long sequence of if (...)'s to a switch
> > 
> >  .../uncore-frequency-common.h                 |   4 +
> >  .../uncore-frequency/uncore-frequency-tpmi.c  | 158
> > +++++++++++++++++-
> >  2 files changed, 160 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-
> > frequency-common.h b/drivers/platform/x86/intel/uncore-
> > frequency/uncore-frequency-common.h
> > index 4c245b945e4e..b5c7311bfa05 100644
> > --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > common.h
> > +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > common.h
> > @@ -70,6 +70,10 @@ enum uncore_index {
> >  	UNCORE_INDEX_MIN_FREQ,
> >  	UNCORE_INDEX_MAX_FREQ,
> >  	UNCORE_INDEX_CURRENT_FREQ,
> > +	UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD,
> > +	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD,
> > +	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE,
> > +	UNCORE_INDEX_EFF_LAT_CTRL_FREQ,
> >  };
> >  
> >  int uncore_freq_common_init(int (*read)(struct uncore_data *data,
> > unsigned int *value,
> > diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-
> > frequency-tpmi.c b/drivers/platform/x86/intel/uncore-
> > frequency/uncore-frequency-tpmi.c
> > index 9fa3037c03d1..50b28b4b1fc0 100644
> > --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > tpmi.c
> > +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > tpmi.c
> > @@ -30,6 +30,7 @@
> >  
> >  #define	UNCORE_MAJOR_VERSION		0
> >  #define	UNCORE_MINOR_VERSION		2
> > +#define UNCORE_ELC_SUPPORTED_VERSION	2
> >  #define UNCORE_HEADER_INDEX		0
> >  #define UNCORE_FABRIC_CLUSTER_OFFSET	8
> >  
> > @@ -46,6 +47,7 @@ struct tpmi_uncore_struct;
> >  /* Information for each cluster */
> >  struct tpmi_uncore_cluster_info {
> >  	bool root_domain;
> > +	bool elc_supported;
> >  	u8 __iomem *cluster_base;
> >  	struct uncore_data uncore_data;
> >  	struct tpmi_uncore_struct *uncore_root;
> > @@ -75,6 +77,10 @@ struct tpmi_uncore_struct {
> >  /* Bit definitions for CONTROL register */
> >  #define
> > UNCORE_MAX_RATIO_MASK				GENMASK_ULL(14, 8)
> >  #define
> > UNCORE_MIN_RATIO_MASK				GENMASK_ULL(21, 15)
> > +#define
> > UNCORE_EFF_LAT_CTRL_RATIO_MASK			GENMASK_ULL(28, 22)
> > +#define
> > UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK		GENMASK_ULL(38, 32)
> > +#define UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE	BIT(39)
> > +#define
> > UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK		GENMASK_ULL(46, 40)
> >  
> >  /* Helper function to read MMIO offset for max/min control
> > frequency */
> >  static void read_control_freq(struct tpmi_uncore_cluster_info
> > *cluster_info,
> > @@ -89,6 +95,48 @@ static void read_control_freq(struct
> > tpmi_uncore_cluster_info *cluster_info,
> >  		*value = FIELD_GET(UNCORE_MIN_RATIO_MASK, control)
> > * UNCORE_FREQ_KHZ_MULTIPLIER;
> >  }
> >  
> > +/* Helper function to read efficiency latency control values over
> > MMIO */
> > +static int read_eff_lat_ctrl(struct uncore_data *data, unsigned
> > int *val, enum uncore_index index)
> > +{
> > +	struct tpmi_uncore_cluster_info *cluster_info;
> > +	u64 ctrl;
> > +
> > +	cluster_info = container_of(data, struct
> > tpmi_uncore_cluster_info, uncore_data);
> > +	if (cluster_info->root_domain)
> > +		return -ENODATA;
> > +
> > +	if (!cluster_info->elc_supported)
> > +		return -EOPNOTSUPP;
> > +
> > +	ctrl = readq(cluster_info->cluster_base +
> > UNCORE_CONTROL_INDEX);
> > +
> > +	switch (index) {
> > +	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
> > +		*val =
> > FIELD_GET(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK, ctrl);
> > +		*val *= 100;
> > +		*val = DIV_ROUND_UP(*val,
> > FIELD_MAX(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK));
> > +		break;
> > +
> > +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
> > +		*val =
> > FIELD_GET(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK, ctrl);
> > +		*val *= 100;
> > +		*val = DIV_ROUND_UP(*val,
> > FIELD_MAX(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK));
> 
> I wonder if DIV_ROUND_CLOSEST() would be more appropriate in these
> two 
> cases, rounding up isn't well justified as I think this wants to
> round it 
> back to the original number to deal with the minor divergences due to
> precision loss during conversions?

Yes, this makes sure that what is written to the file also gets read
back as the same value. Using DIV_ROUND_CLOSEST() will not do the
trick. I tried this out with DIV_ROUND_CLOSEST() and got the following:

# echo 94 > elc_high_threshold_percent 
# cat elc_high_threshold_percent
94
# echo 95 > elc_high_threshold_percent 
# cat elc_high_threshold_percent 
94
# echo 96 > elc_high_threshold_percent 
# cat elc_high_threshold_percent 
95

However, using DIV_ROUND_UP() all values from 0-100 work just fine.
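
For reference, the effect can be reproduced in userspace with a small
sketch. It assumes that the write path (not shown in this part of the
thread) scales the percentage to the 7-bit field with a truncating
division, which is consistent with the readings above. The helper names
are made up for illustration, and the two macros are simplified stand-ins
for the kernel ones, valid for unsigned operands only.

```c
#include <assert.h>

/* Largest raw value of a 7-bit field such as GENMASK_ULL(46, 40). */
#define FIELD_MAX_7BIT 127u

/* Simplified userspace stand-ins for the kernel macros (unsigned only). */
#define DIV_ROUND_UP(n, d)      (((n) + (d) - 1) / (d))
#define DIV_ROUND_CLOSEST(n, d) (((n) + (d) / 2) / (d))

/* ASSUMPTION: the write path truncates when scaling percent to raw. */
static unsigned int percent_to_raw(unsigned int percent)
{
	return percent * FIELD_MAX_7BIT / 100u;
}

/* Read path as in the quoted hunk: multiply by 100, divide rounding up. */
static unsigned int raw_to_percent_up(unsigned int raw)
{
	return DIV_ROUND_UP(raw * 100u, FIELD_MAX_7BIT);
}

/* The alternative that was suggested in review: round to closest. */
static unsigned int raw_to_percent_closest(unsigned int raw)
{
	return DIV_ROUND_CLOSEST(raw * 100u, FIELD_MAX_7BIT);
}

/* Count percentages in 0..100 that do not read back as written. */
static unsigned int roundtrip_mismatches(unsigned int (*to_percent)(unsigned int))
{
	unsigned int p, bad = 0;

	for (p = 0; p <= 100; p++)
		if (to_percent(percent_to_raw(p)) != p)
			bad++;

	return bad;
}

/* Self-check: rounding up restores every written percentage. */
static void selftest(void)
{
	assert(roundtrip_mismatches(raw_to_percent_up) == 0);
}
```

Under that assumption, writing 95 stores 120, and 120 * 100 / 127 is
about 94.49, so rounding to closest yields 94 while rounding up yields 95.
Truncation on write always leaves the scaled-back value less than one
below the original, which is why rounding up restores it.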

-Tero

> 
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> 



* Re: [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation
  2024-08-29 11:39     ` Tero Kristo
@ 2024-08-30  7:23       ` Tero Kristo
  2024-08-30 10:12         ` Ilpo Järvinen
  0 siblings, 1 reply; 12+ messages in thread
From: Tero Kristo @ 2024-08-30  7:23 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Hans de Goede, srinivas.pandruvada, platform-driver-x86, LKML

On Thu, 2024-08-29 at 14:39 +0300, Tero Kristo wrote:
> On Thu, 2024-08-29 at 12:18 +0300, Ilpo Järvinen wrote:
> > On Wed, 28 Aug 2024, Tero Kristo wrote:
> > 
> > > Added documentation about the functionality of efficiency vs.
> > > latency tradeoff
> > > control in intel Xeon processors, and how this is configured via
> > > sysfs.
> > > 
> > > Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> > > ---
> > > v2:
> > >   * Largely re-wrote the documentation
> > > 
> > >  .../pm/intel_uncore_frequency_scaling.rst     | 59
> > > +++++++++++++++++++
> > >  1 file changed, 59 insertions(+)
> > > 
> > > diff --git a/Documentation/admin-
> > > guide/pm/intel_uncore_frequency_scaling.rst
> > > b/Documentation/admin-
> > > guide/pm/intel_uncore_frequency_scaling.rst
> > > index 5ab3440e6cee..26ded32b06f5 100644
> > > --- a/Documentation/admin-
> > > guide/pm/intel_uncore_frequency_scaling.rst
> > > +++ b/Documentation/admin-
> > > guide/pm/intel_uncore_frequency_scaling.rst
> > > @@ -113,3 +113,62 @@ to apply at each uncore* level.
> > >  
> > >  Support for "current_freq_khz" is available only at each fabric
> > > cluster
> > >  level (i.e., in uncore* directory).
> > > +
> > > +Efficiency vs. Latency Tradeoff
> > > +-------------------------------
> > > +
> > > +The Efficiency Latency Control (ELC) feature improves
> > > performance
> > > +per watt. With this feature hardware power management algorithms
> > > +optimize trade-off between latency and power consumption. For
> > > some
> > > +latency sensitive workloads further tuning can be done by SW to
> > > +get desired performance.
> > > +
> > > +The hardware monitors the average CPU utilization across all
> > > cores
> > > +in a power domain at regular intervals and decides an uncore
> > > frequency.
> > > +While this may result in the best performance per watt, workload
> > > may be
> > > +expecting higher performance at the expense of power. Consider
> > > an
> > > +application that intermittently wakes up to perform memory reads
> > > on an
> > > +otherwise idle system. In such cases, if hardware lowers uncore
> > > +frequency, then there may be delay in ramp up of frequency to
> > > meet
> > > +target performance.
> > > +
> > > +The ELC control defines some parameters which can be changed
> > > from
> > > SW.
> > > +If the average CPU utilization is below a user defined threshold
> > > +(elc_low_threshold_percent attribute below), the user defined
> > > uncore
> > > +frequency floor frequency will be used (elc_floor_freq_khz
> > > attribute
> > 
> > Consider the following simplification:
> > 
> > "the user defined uncore frequency floor frequency" ->
> > "the user-defined uncore floor frequency"
> > 
> > I think it tells the same even without that first "frequency".
> > 
> > Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > 
> 
> Yeah, it looks kind of silly. I think that's just a typo from my
> side,
> thanks for catching.

Do you want me to send a new version of this patch or do you fix it
locally? Rest of the patches don't seem to need any changes atm.

-Tero



* Re: [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control
  2024-08-30  7:21     ` Tero Kristo
@ 2024-08-30 10:09       ` Ilpo Järvinen
  0 siblings, 0 replies; 12+ messages in thread
From: Ilpo Järvinen @ 2024-08-30 10:09 UTC (permalink / raw)
  To: Tero Kristo; +Cc: Hans de Goede, srinivas.pandruvada, platform-driver-x86, LKML


On Fri, 30 Aug 2024, Tero Kristo wrote:

> On Thu, 2024-08-29 at 12:14 +0300, Ilpo Järvinen wrote:
> > On Wed, 28 Aug 2024, Tero Kristo wrote:
> > 
> > > Add efficiency latency control support to the TPMI uncore driver.
> > > This
> > > defines two new threshold values for controlling uncore frequency,
> > > low
> > > threshold and high threshold. When CPU utilization is below low
> > > threshold,
> > > the user configurable floor latency control frequency can be used
> > > by the
> > > system. When CPU utilization is above high threshold, the uncore
> > > frequency
> > > is increased in 100MHz steps until power limit is reached.
> > > 
> > > Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> > > ---
> > > v2:
> > >   * Converted a long sequence of if (...)'s to a switch
> > > 
> > >  .../uncore-frequency-common.h                 |   4 +
> > >  .../uncore-frequency/uncore-frequency-tpmi.c  | 158
> > > +++++++++++++++++-
> > >  2 files changed, 160 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-
> > > frequency-common.h b/drivers/platform/x86/intel/uncore-
> > > frequency/uncore-frequency-common.h
> > > index 4c245b945e4e..b5c7311bfa05 100644
> > > --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > > common.h
> > > +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > > common.h
> > > @@ -70,6 +70,10 @@ enum uncore_index {
> > >  	UNCORE_INDEX_MIN_FREQ,
> > >  	UNCORE_INDEX_MAX_FREQ,
> > >  	UNCORE_INDEX_CURRENT_FREQ,
> > > +	UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD,
> > > +	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD,
> > > +	UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE,
> > > +	UNCORE_INDEX_EFF_LAT_CTRL_FREQ,
> > >  };
> > >  
> > >  int uncore_freq_common_init(int (*read)(struct uncore_data *data,
> > > unsigned int *value,
> > > diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-
> > > frequency-tpmi.c b/drivers/platform/x86/intel/uncore-
> > > frequency/uncore-frequency-tpmi.c
> > > index 9fa3037c03d1..50b28b4b1fc0 100644
> > > --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > > tpmi.c
> > > +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > > tpmi.c
> > > @@ -30,6 +30,7 @@
> > >  
> > >  #define	UNCORE_MAJOR_VERSION		0
> > >  #define	UNCORE_MINOR_VERSION		2
> > > +#define UNCORE_ELC_SUPPORTED_VERSION	2
> > >  #define UNCORE_HEADER_INDEX		0
> > >  #define UNCORE_FABRIC_CLUSTER_OFFSET	8
> > >  
> > > @@ -46,6 +47,7 @@ struct tpmi_uncore_struct;
> > >  /* Information for each cluster */
> > >  struct tpmi_uncore_cluster_info {
> > >  	bool root_domain;
> > > +	bool elc_supported;
> > >  	u8 __iomem *cluster_base;
> > >  	struct uncore_data uncore_data;
> > >  	struct tpmi_uncore_struct *uncore_root;
> > > @@ -75,6 +77,10 @@ struct tpmi_uncore_struct {
> > >  /* Bit definitions for CONTROL register */
> > >  #define
> > > UNCORE_MAX_RATIO_MASK				GENMASK_ULL(14, 8)
> > >  #define
> > > UNCORE_MIN_RATIO_MASK				GENMASK_ULL(21, 15)
> > > +#define
> > > UNCORE_EFF_LAT_CTRL_RATIO_MASK			GENMASK_ULL(28, 22)
> > > +#define
> > > UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK		GENMASK_ULL(38, 32)
> > > +#define UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_ENABLE	BIT(39)
> > > +#define
> > > UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK		GENMASK_ULL(46, 40)
> > >  
> > >  /* Helper function to read MMIO offset for max/min control
> > > frequency */
> > >  static void read_control_freq(struct tpmi_uncore_cluster_info
> > > *cluster_info,
> > > @@ -89,6 +95,48 @@ static void read_control_freq(struct
> > > tpmi_uncore_cluster_info *cluster_info,
> > >  		*value = FIELD_GET(UNCORE_MIN_RATIO_MASK, control)
> > > * UNCORE_FREQ_KHZ_MULTIPLIER;
> > >  }
> > >  
> > > +/* Helper function to read efficiency latency control values over
> > > MMIO */
> > > +static int read_eff_lat_ctrl(struct uncore_data *data, unsigned
> > > int *val, enum uncore_index index)
> > > +{
> > > +	struct tpmi_uncore_cluster_info *cluster_info;
> > > +	u64 ctrl;
> > > +
> > > +	cluster_info = container_of(data, struct
> > > tpmi_uncore_cluster_info, uncore_data);
> > > +	if (cluster_info->root_domain)
> > > +		return -ENODATA;
> > > +
> > > +	if (!cluster_info->elc_supported)
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	ctrl = readq(cluster_info->cluster_base +
> > > UNCORE_CONTROL_INDEX);
> > > +
> > > +	switch (index) {
> > > +	case UNCORE_INDEX_EFF_LAT_CTRL_LOW_THRESHOLD:
> > > +		*val =
> > > FIELD_GET(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK, ctrl);
> > > +		*val *= 100;
> > > +		*val = DIV_ROUND_UP(*val,
> > > FIELD_MAX(UNCORE_EFF_LAT_CTRL_LOW_THRESHOLD_MASK));
> > > +		break;
> > > +
> > > +	case UNCORE_INDEX_EFF_LAT_CTRL_HIGH_THRESHOLD:
> > > +		*val =
> > > FIELD_GET(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK, ctrl);
> > > +		*val *= 100;
> > > +		*val = DIV_ROUND_UP(*val,
> > > FIELD_MAX(UNCORE_EFF_LAT_CTRL_HIGH_THRESHOLD_MASK));
> > 
> > I wonder if DIV_ROUND_CLOSEST() would be more appropriate in these
> > two 
> > cases, rounding up isn't well justified as I think this wants to
> > round it 
> > back to the original number to deal with the minor divergences due to
> > precision loss during conversions?
> 
> Yes, this makes sure that what is written to the file also gets read
> back as the same value. Using DIV_ROUND_CLOSEST() will not do the
> trick. I tried this out with DIV_ROUND_CLOSEST() and got the following:
> 
> # echo 94 > elc_high_threshold_percent 
> # cat elc_high_threshold_percent
> 94
> # echo 95 > elc_high_threshold_percent 
> # cat elc_high_threshold_percent 
> 94
> # echo 96 > elc_high_threshold_percent 
> # cat elc_high_threshold_percent 
> 95
> 
> However, using DIV_ROUND_UP() all values from 0-100 work just fine.

Okay, thanks for checking this out.

-- 
 i.


* Re: [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation
  2024-08-30  7:23       ` Tero Kristo
@ 2024-08-30 10:12         ` Ilpo Järvinen
  2024-09-04 17:51           ` Hans de Goede
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2024-08-30 10:12 UTC (permalink / raw)
  To: Tero Kristo, Hans de Goede; +Cc: srinivas.pandruvada, platform-driver-x86, LKML


On Fri, 30 Aug 2024, Tero Kristo wrote:

> On Thu, 2024-08-29 at 14:39 +0300, Tero Kristo wrote:
> > On Thu, 2024-08-29 at 12:18 +0300, Ilpo Järvinen wrote:
> > > On Wed, 28 Aug 2024, Tero Kristo wrote:
> > > 
> > > > Added documentation about the functionality of efficiency vs.
> > > > latency tradeoff
> > > > control in intel Xeon processors, and how this is configured via
> > > > sysfs.
> > > > 
> > > > Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
> > > > ---
> > > > v2:
> > > >   * Largely re-wrote the documentation
> > > > 
> > > >  .../pm/intel_uncore_frequency_scaling.rst     | 59
> > > > +++++++++++++++++++
> > > >  1 file changed, 59 insertions(+)
> > > > 
> > > > diff --git a/Documentation/admin-
> > > > guide/pm/intel_uncore_frequency_scaling.rst
> > > > b/Documentation/admin-
> > > > guide/pm/intel_uncore_frequency_scaling.rst
> > > > index 5ab3440e6cee..26ded32b06f5 100644
> > > > --- a/Documentation/admin-
> > > > guide/pm/intel_uncore_frequency_scaling.rst
> > > > +++ b/Documentation/admin-
> > > > guide/pm/intel_uncore_frequency_scaling.rst
> > > > @@ -113,3 +113,62 @@ to apply at each uncore* level.
> > > >  
> > > >  Support for "current_freq_khz" is available only at each fabric
> > > > cluster
> > > >  level (i.e., in uncore* directory).
> > > > +
> > > > +Efficiency vs. Latency Tradeoff
> > > > +-------------------------------
> > > > +
> > > > +The Efficiency Latency Control (ELC) feature improves
> > > > performance
> > > > +per watt. With this feature hardware power management algorithms
> > > > +optimize trade-off between latency and power consumption. For
> > > > some
> > > > +latency sensitive workloads further tuning can be done by SW to
> > > > +get desired performance.
> > > > +
> > > > +The hardware monitors the average CPU utilization across all
> > > > cores
> > > > +in a power domain at regular intervals and decides an uncore
> > > > frequency.
> > > > +While this may result in the best performance per watt, workload
> > > > may be
> > > > +expecting higher performance at the expense of power. Consider
> > > > an
> > > > +application that intermittently wakes up to perform memory reads
> > > > on an
> > > > +otherwise idle system. In such cases, if hardware lowers uncore
> > > > +frequency, then there may be delay in ramp up of frequency to
> > > > meet
> > > > +target performance.
> > > > +
> > > > +The ELC control defines some parameters which can be changed
> > > > from
> > > > SW.
> > > > +If the average CPU utilization is below a user defined threshold
> > > > +(elc_low_threshold_percent attribute below), the user defined
> > > > uncore
> > > > +frequency floor frequency will be used (elc_floor_freq_khz
> > > > attribute
> > > 
> > > Consider the following simplification:
> > > 
> > > "the user defined uncore frequency floor frequency" ->
> > > "the user-defined uncore floor frequency"
> > > 
> > > I think it tells the same even without that first "frequency".
> > > 
> > > Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > > 
> > 
> > Yeah, it looks kind of silly. I think that's just a typo from my
> > side,
> > thanks for catching.
> 
> Do you want me to send a new version of this patch or do you fix it
> locally? Rest of the patches don't seem to need any changes atm.

That's up to Hans, but it looks like a trivial change, so he can probably
fix it while applying.

Hans, v2 of this series seems ready to go (with the small change to
the documentation patch as discussed above).

-- 
 i.


* Re: [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation
  2024-08-30 10:12         ` Ilpo Järvinen
@ 2024-09-04 17:51           ` Hans de Goede
  0 siblings, 0 replies; 12+ messages in thread
From: Hans de Goede @ 2024-09-04 17:51 UTC (permalink / raw)
  To: Ilpo Järvinen, Tero Kristo
  Cc: srinivas.pandruvada, platform-driver-x86, LKML

Hi,

On 8/30/24 12:12 PM, Ilpo Järvinen wrote:
> On Fri, 30 Aug 2024, Tero Kristo wrote:
> 
>> On Thu, 2024-08-29 at 14:39 +0300, Tero Kristo wrote:
>>> On Thu, 2024-08-29 at 12:18 +0300, Ilpo Järvinen wrote:
>>>> On Wed, 28 Aug 2024, Tero Kristo wrote:
>>>>
>>>>> Added documentation about the functionality of efficiency vs.
>>>>> latency tradeoff
>>>>> control in intel Xeon processors, and how this is configured via
>>>>> sysfs.
>>>>>
>>>>> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
>>>>> ---
>>>>> v2:
>>>>>   * Largely re-wrote the documentation
>>>>>
>>>>>  .../pm/intel_uncore_frequency_scaling.rst     | 59
>>>>> +++++++++++++++++++
>>>>>  1 file changed, 59 insertions(+)
>>>>>
>>>>> diff --git a/Documentation/admin-
>>>>> guide/pm/intel_uncore_frequency_scaling.rst
>>>>> b/Documentation/admin-
>>>>> guide/pm/intel_uncore_frequency_scaling.rst
>>>>> index 5ab3440e6cee..26ded32b06f5 100644
>>>>> --- a/Documentation/admin-
>>>>> guide/pm/intel_uncore_frequency_scaling.rst
>>>>> +++ b/Documentation/admin-
>>>>> guide/pm/intel_uncore_frequency_scaling.rst
>>>>> @@ -113,3 +113,62 @@ to apply at each uncore* level.
>>>>>  
>>>>>  Support for "current_freq_khz" is available only at each fabric
>>>>> cluster
>>>>>  level (i.e., in uncore* directory).
>>>>> +
>>>>> +Efficiency vs. Latency Tradeoff
>>>>> +-------------------------------
>>>>> +
>>>>> +The Efficiency Latency Control (ELC) feature improves
>>>>> performance
>>>>> +per watt. With this feature hardware power management algorithms
>>>>> +optimize trade-off between latency and power consumption. For
>>>>> some
>>>>> +latency sensitive workloads further tuning can be done by SW to
>>>>> +get desired performance.
>>>>> +
>>>>> +The hardware monitors the average CPU utilization across all
>>>>> cores
>>>>> +in a power domain at regular intervals and decides an uncore
>>>>> frequency.
>>>>> +While this may result in the best performance per watt, workload
>>>>> may be
>>>>> +expecting higher performance at the expense of power. Consider
>>>>> an
>>>>> +application that intermittently wakes up to perform memory reads
>>>>> on an
>>>>> +otherwise idle system. In such cases, if hardware lowers uncore
>>>>> +frequency, then there may be delay in ramp up of frequency to
>>>>> meet
>>>>> +target performance.
>>>>> +
>>>>> +The ELC control defines some parameters which can be changed
>>>>> from
>>>>> SW.
>>>>> +If the average CPU utilization is below a user defined threshold
>>>>> +(elc_low_threshold_percent attribute below), the user defined
>>>>> uncore
>>>>> +frequency floor frequency will be used (elc_floor_freq_khz
>>>>> attribute
>>>>
>>>> Consider the following simplification:
>>>>
>>>> "the user defined uncore frequency floor frequency" ->
>>>> "the user-defined uncore floor frequency"
>>>>
>>>> I think it tells the same even without that first "frequency".
>>>>
>>>> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>>>
>>>
>>> Yeah, it looks kind of silly. I think that's just a typo from my
>>> side,
>>> thanks for catching.
>>
>> Do you want me to send a new version of this patch or do you fix it
>> locally? Rest of the patches don't seem to need any changes atm.
> 
> That's up to Hans, but it looks like a trivial change, so he can probably
> fix it while applying.
> 
> Hans, v2 of this series seems ready to go (with the small change to
> the documentation patch as discussed above).

Ack, I've applied the series to my review-hans branch:
https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git/log/?h=review-hans

with the suggested improvement to intel_uncore_frequency_scaling.rst
squashed in.

Note it will show up in my review-hans branch once I've pushed my
local branch there, which might take a while.

Once I've run some tests on this branch the patches there will be
added to the platform-drivers-x86/for-next branch and eventually
will be included in the pdx86 pull-request to Linus for the next
merge-window.

Regards,

Hans





end of thread, other threads:[~2024-09-04 17:51 UTC | newest]

Thread overview: 12+ messages
2024-08-28 15:34 [PATCH v2 0/3] Intel uncore driver ELC support Tero Kristo
2024-08-28 15:34 ` [PATCH v2 1/3] Documentation: admin-guide: pm: Add efficiency vs. latency tradeoff to uncore documentation Tero Kristo
2024-08-29  9:18   ` Ilpo Järvinen
2024-08-29 11:39     ` Tero Kristo
2024-08-30  7:23       ` Tero Kristo
2024-08-30 10:12         ` Ilpo Järvinen
2024-09-04 17:51           ` Hans de Goede
2024-08-28 15:34 ` [PATCH v2 2/3] platform/x86/intel-uncore-freq: Add support for efficiency latency control Tero Kristo
2024-08-29  9:14   ` Ilpo Järvinen
2024-08-30  7:21     ` Tero Kristo
2024-08-30 10:09       ` Ilpo Järvinen
2024-08-28 15:34 ` [PATCH v2 3/3] platform/x86/intel-uncore-freq: Add efficiency latency control to sysfs interface Tero Kristo
