devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/13] thermal: tegra: Do not register cooling device
@ 2023-10-12 17:58 Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 01/13] thermal: Store device tree node for thermal zone devices Thierry Reding
                   ` (15 more replies)
  0 siblings, 16 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

Hi,

this set of patches removes the registration of the SOCTHERM internal
throttling mechanism as cooling device. Since this throttling starts
automatically once a certain temperature threshold is crossed, it
doesn't make sense to represent it as a cooling device, which are
typically "manually" activated by the thermal framework when thermal
sensors report temperature thresholds being crossed.

Instead of using the cooling device mechanism, this statically programs
the throttling mechanism when it is configured in device tree. In order
to do this, an additional device tree property is needed to replace the
information that was previously contained in trip points.

There's a few preparatory patches to make the removal a bit simpler and
also some follow up cleanups included as well.

Changes in v2:
- rework the device tree bindings:
  - add nvidia,thermal-zones property to attach throttling to zones
  - use -millicelsius suffix and add hysteresis
- add patch to store thermal zone device tree node for later use
- add patch to enforce self-encapsulation of the thermal core now that
  no drivers need to reach into it anymore

This applies on top of Daniel's self-encapsulation hardening series:

	https://lore.kernel.org/all/20231012102700.2858952-1-daniel.lezcano@linaro.org/

Thierry

Thierry Reding (13):
  thermal: Store device tree node for thermal zone devices
  dt-bindings: thermal: tegra: Document throttle temperature
  dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
  thermal: tegra: Use driver-private data consistently
  thermal: tegra: Constify SoC-specific data
  thermal: tegra: Do not register cooling device
  thermal: tegra: Use unsigned int where appropriate
  thermal: tegra: Avoid over-allocation of temporary array
  thermal: tegra: Remove gratuitous error assignment
  thermal: tegra: Minor stylistic cleanups
  ARM: tegra: Rework SOCTHERM on Tegra124
  arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
  thermal: Enforce self-encapsulation

 .../thermal/nvidia,tegra124-soctherm.yaml     |  19 +
 arch/arm/boot/dts/nvidia/tegra124.dtsi        |  68 +--
 arch/arm64/boot/dts/nvidia/tegra132.dtsi      |  66 +--
 arch/arm64/boot/dts/nvidia/tegra210.dtsi      |  86 +--
 drivers/thermal/tegra/soctherm.c              | 525 ++++++++----------
 drivers/thermal/tegra/soctherm.h              |   1 +
 drivers/thermal/tegra/tegra124-soctherm.c     |   4 +
 drivers/thermal/tegra/tegra132-soctherm.c     |   4 +
 drivers/thermal/tegra/tegra210-soctherm.c     |   4 +
 drivers/thermal/thermal_core.h                |   2 +-
 drivers/thermal/thermal_of.c                  |   3 +
 11 files changed, 329 insertions(+), 453 deletions(-)

-- 
2.42.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 01/13] thermal: Store device tree node for thermal zone devices
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 02/13] dt-bindings: thermal: tegra: Document throttle temperature Thierry Reding
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

Store the device tree node for each thermal zone device to allow it to
be referenced.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v2:
- new patch to track thermal zone device tree nodes

 drivers/thermal/thermal_of.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index db83596ea105..2acb65e97090 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -443,6 +443,7 @@ static void thermal_of_zone_unregister(struct thermal_zone_device *tz)
 	struct thermal_zone_device_ops *ops = tz->ops;
 
 	thermal_zone_device_disable(tz);
+	of_node_put(tz->device.of_node);
 	thermal_zone_device_unregister(tz);
 	kfree(trips);
 	kfree(ops);
@@ -521,6 +522,8 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
 		goto out_kfree_trips;
 	}
 
+	tz->device.of_node = np;
+
 	ret = thermal_zone_device_enable(tz);
 	if (ret) {
 		pr_err("Failed to enabled thermal zone '%s', id=%d: %d\n",
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 02/13] dt-bindings: thermal: tegra: Document throttle temperature
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 01/13] thermal: Store device tree node for thermal zone devices Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-16 14:02   ` Rob Herring
  2023-10-12 17:58 ` [PATCH v2 03/13] dt-bindings: thermal: tegra: Add nvidia,thermal-zones property Thierry Reding
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

Each throttling configuration needs to specify the temperature threshold
at which it should start throttling. Previously this was tied to a given
trip point as a cooling device and used the temperature specified for
that trip point. This doesn't work well because the throttling mechanism
is not a cooling device in the traditional sense.

Instead, allow device trees to specify the throttle temperature in the
throttle configuration directly so that the throttle doesn't need to be
exposed as a cooling device.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v2:
- rename temperature to temperature-millicelsius and drop $ref
- add hysteresis-millicelsius property

 .../bindings/thermal/nvidia,tegra124-soctherm.yaml | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
index 04a2ba1aa946..0eb6277082fe 100644
--- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
+++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
@@ -121,6 +121,20 @@ properties:
               # high (85%, TEGRA_SOCTHERM_THROT_LEVEL_HIGH)
               - 3
 
+          temperature-millicelsius:
+            minimum: -273000
+            maximum: 200000
+            description: The temperature threshold (in millicelsius) that,
+              when crossed, will trigger the configured automatic throttling.
+
+          hysteresis-millicelsius:
+            description: An unsigned integer expressing the hysteresis delta
+              (in millicelsius) with respect to the threshold temperature
+              property above. Throttling will be initiated when the
+              temperature falls below (temperature - hysteresis). This avoids
+              situations where throttling is repeatedly initiated and stopped
+              because of minor temperature variations.
+
           # optional
           # Tegra210 specific and valid only for OCx throttle events
           nvidia,count-threshold:
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 03/13] dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 01/13] thermal: Store device tree node for thermal zone devices Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 02/13] dt-bindings: thermal: tegra: Document throttle temperature Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-13 15:59   ` Daniel Lezcano
  2023-10-12 17:58 ` [PATCH v2 04/13] thermal: tegra: Use driver-private data consistently Thierry Reding
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The throttle configurations need to be associated with one or more
thermal zones before they can be enabled, so introduce a new property,
called nvidia,thermal-zones, that contains a list of phandles to the
thermal zones for which a throttle configuration is applicable.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v2:
- new patch to hook up throttling with thermal zones

 .../bindings/thermal/nvidia,tegra124-soctherm.yaml           | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
index 0eb6277082fe..359344f60a6e 100644
--- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
+++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
@@ -161,6 +161,11 @@ properties:
               throttling is engaged after the OC event is deasserted.
             default: 0
 
+          nvidia,thermal-zones:
+            $ref: /schemas/types.yaml#/definitions/phandle
+            description: A list of phandles to the thermal zones that this
+              throttle configuration applies to.
+
   # optional
   nvidia,thermtrips:
     $ref: /schemas/types.yaml#/definitions/uint32-matrix
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 04/13] thermal: tegra: Use driver-private data consistently
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (2 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 03/13] dt-bindings: thermal: tegra: Add nvidia,thermal-zones property Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-13  8:04   ` Daniel Lezcano
  2023-10-12 17:58 ` [PATCH v2 05/13] thermal: tegra: Constify SoC-specific data Thierry Reding
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

Instead of passing around platform and plain devices and figuring out
the driver-private data within each helper, directly pass around the
driver-private data when it's available.

Also store a pointer to the parent device in the main driver-private
data structure for easier access.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 213 ++++++++++++++-----------------
 1 file changed, 95 insertions(+), 118 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index e7fe8683bfc5..f434e141dcdf 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -299,7 +299,6 @@ static const char *const throt_names[] = {
 struct tegra_soctherm;
 struct tegra_thermctl_zone {
 	void __iomem *reg;
-	struct device *dev;
 	struct tegra_soctherm *ts;
 	struct thermal_zone_device *tz;
 	const struct tegra_tsensor_group *sg;
@@ -327,6 +326,7 @@ struct soctherm_throt_cfg {
 };
 
 struct tegra_soctherm {
+	struct device *dev;
 	struct reset_control *reset;
 	struct clk *clock_tsensor;
 	struct clk *clock_soctherm;
@@ -444,14 +444,15 @@ static int tegra_thermctl_get_temp(struct thermal_zone_device *tz, int *out_temp
  *
  * Return: The precision adjusted capped temperature in millicelsius.
  */
-static int enforce_temp_range(struct device *dev, int trip_temp)
+static int enforce_temp_range(struct tegra_soctherm *ts, int trip_temp)
 {
 	int temp;
 
 	temp = clamp_val(trip_temp, min_low_temp, max_high_temp);
 	if (temp != trip_temp)
-		dev_dbg(dev, "soctherm: trip temperature %d forced to %d\n",
+		dev_dbg(ts->dev, "trip temperature %d forced to %d\n",
 			trip_temp, temp);
+
 	return temp;
 }
 
@@ -471,18 +472,17 @@ static int enforce_temp_range(struct device *dev, int trip_temp)
  *
  * Return: 0 upon success, or %-EINVAL upon failure.
  */
-static int thermtrip_program(struct device *dev,
+static int thermtrip_program(struct tegra_soctherm *ts,
 			     const struct tegra_tsensor_group *sg,
 			     int trip_temp)
 {
-	struct tegra_soctherm *ts = dev_get_drvdata(dev);
 	int temp;
 	u32 r;
 
 	if (!sg || !sg->thermtrip_threshold_mask)
 		return -EINVAL;
 
-	temp = enforce_temp_range(dev, trip_temp) / ts->soc->thresh_grain;
+	temp = enforce_temp_range(ts, trip_temp) / ts->soc->thresh_grain;
 
 	r = readl(ts->regs + THERMCTL_THERMTRIP_CTL);
 	r = REG_SET_MASK(r, sg->thermtrip_threshold_mask, temp);
@@ -496,7 +496,7 @@ static int thermtrip_program(struct device *dev,
 /**
  * throttrip_program() - Configures the hardware to throttle the
  * pulse if a given sensor group reaches a given temperature
- * @dev: ptr to the struct device for the SOC_THERM IP block
+ * @ts: pointer to a struct tegra_soctherm
  * @sg: pointer to the sensor group to set the thermtrip temperature for
  * @stc: pointer to the throttle need to be triggered
  * @trip_temp: the temperature in millicelsius to trigger the thermal trip at
@@ -510,12 +510,11 @@ static int thermtrip_program(struct device *dev,
  *
  * Return: 0 upon success, or %-EINVAL upon failure.
  */
-static int throttrip_program(struct device *dev,
+static int throttrip_program(struct tegra_soctherm *ts,
 			     const struct tegra_tsensor_group *sg,
 			     struct soctherm_throt_cfg *stc,
 			     int trip_temp)
 {
-	struct tegra_soctherm *ts = dev_get_drvdata(dev);
 	int temp, cpu_throt, gpu_throt;
 	unsigned int throt;
 	u32 r, reg_off;
@@ -523,7 +522,7 @@ static int throttrip_program(struct device *dev,
 	if (!sg || !stc || !stc->init)
 		return -EINVAL;
 
-	temp = enforce_temp_range(dev, trip_temp) / ts->soc->thresh_grain;
+	temp = enforce_temp_range(ts, trip_temp) / ts->soc->thresh_grain;
 
 	/* Hardcode LIGHT on LEVEL1 and HEAVY on LEVEL2 */
 	throt = stc->id;
@@ -536,7 +535,7 @@ static int throttrip_program(struct device *dev,
 		cpu_throt = THERMCTL_LVL0_CPU0_CPU_THROT_HEAVY;
 		gpu_throt = THERMCTL_LVL0_CPU0_GPU_THROT_HEAVY;
 		if (throt != THROTTLE_HEAVY)
-			dev_warn(dev,
+			dev_warn(ts->dev,
 				 "invalid throt id %d - assuming HEAVY",
 				 throt);
 	}
@@ -588,7 +587,6 @@ static int tegra_thermctl_set_trip_temp(struct thermal_zone_device *tz, int trip
 	struct tegra_soctherm *ts = zone->ts;
 	struct thermal_trip trip;
 	const struct tegra_tsensor_group *sg = zone->sg;
-	struct device *dev = zone->dev;
 	int ret;
 
 	if (!tz)
@@ -605,7 +603,7 @@ static int tegra_thermctl_set_trip_temp(struct thermal_zone_device *tz, int trip
 		 * if not, program critical trip to HW.
 		 */
 		if (min_low_temp == tsensor_group_thermtrip_get(ts, sg->id))
-			return thermtrip_program(dev, sg, temp);
+			return thermtrip_program(ts, sg, temp);
 		else
 			return 0;
 
@@ -625,7 +623,7 @@ static int tegra_thermctl_set_trip_temp(struct thermal_zone_device *tz, int trip
 			else
 				continue;
 
-			return throttrip_program(dev, sg, stc, temp);
+			return throttrip_program(ts, sg, stc, temp);
 		}
 	}
 
@@ -667,9 +665,9 @@ static int tegra_thermctl_set_trips(struct thermal_zone_device *tz, int lo, int
 	r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_EN_MASK, 0);
 	writel(r, zone->ts->regs + zone->sg->thermctl_lvl0_offset);
 
-	lo = enforce_temp_range(zone->dev, lo) / zone->ts->soc->thresh_grain;
-	hi = enforce_temp_range(zone->dev, hi) / zone->ts->soc->thresh_grain;
-	dev_dbg(zone->dev, "%s hi:%d, lo:%d\n", __func__, hi, lo);
+	lo = enforce_temp_range(zone->ts, lo) / zone->ts->soc->thresh_grain;
+	hi = enforce_temp_range(zone->ts, hi) / zone->ts->soc->thresh_grain;
+	dev_dbg(zone->ts->dev, "%s hi:%d, lo:%d\n", __func__, hi, lo);
 
 	r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_up_thresh_mask, hi);
 	r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_dn_thresh_mask, lo);
@@ -731,11 +729,10 @@ static int get_hot_temp(struct thermal_zone_device *tz, int *trip_id, int *temp)
  * this one appears on the serial console:
  * ""throttrip: will throttle when sensor group XXX reaches YYYYYY mC"
  */
-static int tegra_soctherm_set_hwtrips(struct device *dev,
+static int tegra_soctherm_set_hwtrips(struct tegra_soctherm *ts,
 				      const struct tegra_tsensor_group *sg,
 				      struct thermal_zone_device *tz)
 {
-	struct tegra_soctherm *ts = dev_get_drvdata(dev);
 	struct soctherm_throt_cfg *stc;
 	int i, trip, temperature, ret;
 
@@ -745,18 +742,19 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
 		if (thermal_zone_get_crit_temp(tz, &temperature))
 			temperature = max_high_temp;
 
-	ret = thermtrip_program(dev, sg, temperature);
+	ret = thermtrip_program(ts, sg, temperature);
 	if (ret) {
-		dev_err(dev, "thermtrip: %s: error during enable\n", sg->name);
+		dev_err(ts->dev, "thermtrip: %s: error during enable\n",
+			sg->name);
 		return ret;
 	}
 
-	dev_info(dev, "thermtrip: will shut down when %s reaches %d mC\n",
+	dev_info(ts->dev, "thermtrip: will shut down when %s reaches %d mC\n",
 		 sg->name, temperature);
 
 	ret = get_hot_temp(tz, &trip, &temperature);
 	if (ret) {
-		dev_info(dev, "throttrip: %s: missing hot temperature\n",
+		dev_info(ts->dev, "throttrip: %s: missing hot temperature\n",
 			 sg->name);
 		return 0;
 	}
@@ -773,21 +771,21 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
 		else
 			continue;
 
-		ret = throttrip_program(dev, sg, stc, temperature);
+		ret = throttrip_program(ts, sg, stc, temperature);
 		if (ret) {
-			dev_err(dev, "throttrip: %s: error during enable\n",
+			dev_err(ts->dev, "throttrip: %s: error during enable\n",
 				sg->name);
 			return ret;
 		}
 
-		dev_info(dev,
+		dev_info(ts->dev,
 			 "throttrip: will throttle when %s reaches %d mC\n",
 			 sg->name, temperature);
 		break;
 	}
 
 	if (i == THROTTLE_SIZE)
-		dev_info(dev, "throttrip: %s: missing throttle cdev\n",
+		dev_info(ts->dev, "throttrip: %s: missing throttle cdev\n",
 			 sg->name);
 
 	return 0;
@@ -1253,8 +1251,7 @@ static int soctherm_oc_int_init(struct device_node *np, int num_irqs)
 #ifdef CONFIG_DEBUG_FS
 static int regs_show(struct seq_file *s, void *data)
 {
-	struct platform_device *pdev = s->private;
-	struct tegra_soctherm *ts = platform_get_drvdata(pdev);
+	struct tegra_soctherm *ts = s->private;
 	const struct tegra_tsensor *tsensors = ts->soc->tsensors;
 	const struct tegra_tsensor_group **ttgs = ts->soc->ttgs;
 	u32 r, state;
@@ -1449,24 +1446,24 @@ static int regs_show(struct seq_file *s, void *data)
 
 DEFINE_SHOW_ATTRIBUTE(regs);
 
-static void soctherm_debug_init(struct platform_device *pdev)
+static void soctherm_debug_init(struct tegra_soctherm *tegra)
 {
-	struct tegra_soctherm *tegra = platform_get_drvdata(pdev);
 	struct dentry *root;
 
 	root = debugfs_create_dir("soctherm", NULL);
 
 	tegra->debugfs_dir = root;
 
-	debugfs_create_file("reg_contents", 0644, root, pdev, &regs_fops);
+	debugfs_create_file("reg_contents", 0644, root, tegra, &regs_fops);
 }
 #else
-static inline void soctherm_debug_init(struct platform_device *pdev) {}
+static inline void soctherm_debug_init(struct tegra_soctherm *ts)
+{
+}
 #endif
 
-static int soctherm_clk_enable(struct platform_device *pdev, bool enable)
+static int soctherm_clk_enable(struct tegra_soctherm *tegra, bool enable)
 {
-	struct tegra_soctherm *tegra = platform_get_drvdata(pdev);
 	int err;
 
 	if (!tegra->clock_soctherm || !tegra->clock_tsensor)
@@ -1531,10 +1528,8 @@ static const struct thermal_cooling_device_ops throt_cooling_ops = {
 	.set_cur_state = throt_set_cdev_state,
 };
 
-static int soctherm_thermtrips_parse(struct platform_device *pdev)
+static int soctherm_thermtrips_parse(struct tegra_soctherm *ts)
 {
-	struct device *dev = &pdev->dev;
-	struct tegra_soctherm *ts = dev_get_drvdata(dev);
 	struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
 	const int max_num_prop = ts->soc->num_ttgs * 2;
 	u32 *tlb;
@@ -1543,22 +1538,22 @@ static int soctherm_thermtrips_parse(struct platform_device *pdev)
 	if (!tt)
 		return -ENOMEM;
 
-	n = of_property_count_u32_elems(dev->of_node, "nvidia,thermtrips");
+	n = of_property_count_u32_elems(ts->dev->of_node, "nvidia,thermtrips");
 	if (n <= 0) {
-		dev_info(dev,
+		dev_info(ts->dev,
 			 "missing thermtrips, will use critical trips as shut down temp\n");
 		return n;
 	}
 
 	n = min(max_num_prop, n);
 
-	tlb = devm_kcalloc(&pdev->dev, max_num_prop, sizeof(u32), GFP_KERNEL);
+	tlb = devm_kcalloc(ts->dev, max_num_prop, sizeof(u32), GFP_KERNEL);
 	if (!tlb)
 		return -ENOMEM;
-	ret = of_property_read_u32_array(dev->of_node, "nvidia,thermtrips",
+	ret = of_property_read_u32_array(ts->dev->of_node, "nvidia,thermtrips",
 					 tlb, n);
 	if (ret) {
-		dev_err(dev, "invalid num ele: thermtrips:%d\n", ret);
+		dev_err(ts->dev, "invalid num ele: thermtrips:%d\n", ret);
 		return ret;
 	}
 
@@ -1575,9 +1570,9 @@ static int soctherm_thermtrips_parse(struct platform_device *pdev)
 	return 0;
 }
 
-static void soctherm_oc_cfg_parse(struct device *dev,
-				struct device_node *np_oc,
-				struct soctherm_throt_cfg *stc)
+static void soctherm_oc_cfg_parse(struct tegra_soctherm *tegra,
+				  struct device_node *np_oc,
+				  struct soctherm_throt_cfg *stc)
 {
 	u32 val;
 
@@ -1601,19 +1596,20 @@ static void soctherm_oc_cfg_parse(struct device *dev,
 	stc->oc_cfg.mode = OC_THROTTLE_MODE_BRIEF;
 }
 
-static int soctherm_throt_cfg_parse(struct device *dev,
+static int soctherm_throt_cfg_parse(struct tegra_soctherm *ts,
 				    struct device_node *np,
 				    struct soctherm_throt_cfg *stc)
 {
-	struct tegra_soctherm *ts = dev_get_drvdata(dev);
 	int ret;
 	u32 val;
 
 	ret = of_property_read_u32(np, "nvidia,priority", &val);
 	if (ret) {
-		dev_err(dev, "throttle-cfg: %s: invalid priority\n", stc->name);
+		dev_err(ts->dev, "throttle-cfg: %s: invalid priority\n",
+			stc->name);
 		return -EINVAL;
 	}
+
 	stc->priority = val;
 
 	ret = of_property_read_u32(np, ts->soc->use_ccroc ?
@@ -1640,7 +1636,7 @@ static int soctherm_throt_cfg_parse(struct device *dev,
 	return 0;
 
 err:
-	dev_err(dev, "throttle-cfg: %s: no throt prop or invalid prop\n",
+	dev_err(ts->dev, "throttle-cfg: %s: no throt prop or invalid prop\n",
 		stc->name);
 	return -EINVAL;
 }
@@ -1648,25 +1644,23 @@ static int soctherm_throt_cfg_parse(struct device *dev,
 /**
  * soctherm_init_hw_throt_cdev() - Parse the HW throttle configurations
  * and register them as cooling devices.
- * @pdev: Pointer to platform_device struct
+ * @tegra: pointer to Tegra soctherm structure
  */
-static void soctherm_init_hw_throt_cdev(struct platform_device *pdev)
+static void soctherm_init_hw_throt_cdev(struct tegra_soctherm *tegra)
 {
-	struct device *dev = &pdev->dev;
-	struct tegra_soctherm *ts = dev_get_drvdata(dev);
 	struct device_node *np_stc, *np_stcc;
 	const char *name;
 	int i;
 
 	for (i = 0; i < THROTTLE_SIZE; i++) {
-		ts->throt_cfgs[i].name = throt_names[i];
-		ts->throt_cfgs[i].id = i;
-		ts->throt_cfgs[i].init = false;
+		tegra->throt_cfgs[i].name = throt_names[i];
+		tegra->throt_cfgs[i].id = i;
+		tegra->throt_cfgs[i].init = false;
 	}
 
-	np_stc = of_get_child_by_name(dev->of_node, "throttle-cfgs");
+	np_stc = of_get_child_by_name(tegra->dev->of_node, "throttle-cfgs");
 	if (!np_stc) {
-		dev_info(dev,
+		dev_info(tegra->dev,
 			 "throttle-cfg: no throttle-cfgs - not enabling\n");
 		return;
 	}
@@ -1677,33 +1671,34 @@ static void soctherm_init_hw_throt_cdev(struct platform_device *pdev)
 		int err;
 
 		name = np_stcc->name;
-		stc = find_throttle_cfg_by_name(ts, name);
+		stc = find_throttle_cfg_by_name(tegra, name);
 		if (!stc) {
-			dev_err(dev,
-				"throttle-cfg: could not find %s\n", name);
+			dev_err(tegra->dev, "throttle-cfg: could not find %s\n",
+				name);
 			continue;
 		}
 
 		if (stc->init) {
-			dev_err(dev, "throttle-cfg: %s: redefined!\n", name);
+			dev_err(tegra->dev, "throttle-cfg: %s: redefined!\n",
+				name);
 			of_node_put(np_stcc);
 			break;
 		}
 
-		err = soctherm_throt_cfg_parse(dev, np_stcc, stc);
+		err = soctherm_throt_cfg_parse(tegra, np_stcc, stc);
 		if (err)
 			continue;
 
 		if (stc->id >= THROTTLE_OC1) {
-			soctherm_oc_cfg_parse(dev, np_stcc, stc);
+			soctherm_oc_cfg_parse(tegra, np_stcc, stc);
 			stc->init = true;
 		} else {
 
 			tcd = thermal_of_cooling_device_register(np_stcc,
-							 (char *)name, ts,
+							 (char *)name, tegra,
 							 &throt_cooling_ops);
 			if (IS_ERR_OR_NULL(tcd)) {
-				dev_err(dev,
+				dev_err(tegra->dev,
 					"throttle-cfg: %s: failed to register cooling device\n",
 					name);
 				continue;
@@ -1711,7 +1706,6 @@ static void soctherm_init_hw_throt_cdev(struct platform_device *pdev)
 			stc->cdev = tcd;
 			stc->init = true;
 		}
-
 	}
 
 	of_node_put(np_stc);
@@ -1931,9 +1925,8 @@ static void soctherm_throttle_program(struct tegra_soctherm *ts,
 	writel(r, ts->regs + THROT_PRIORITY_LOCK);
 }
 
-static void tegra_soctherm_throttle(struct device *dev)
+static void tegra_soctherm_throttle(struct tegra_soctherm *ts)
 {
-	struct tegra_soctherm *ts = dev_get_drvdata(dev);
 	u32 v;
 	int i;
 
@@ -1969,43 +1962,29 @@ static void tegra_soctherm_throttle(struct device *dev)
 	writel(v, ts->regs + THERMCTL_STATS_CTL);
 }
 
-static int soctherm_interrupts_init(struct platform_device *pdev,
-				    struct tegra_soctherm *tegra)
+static int soctherm_interrupts_init(struct tegra_soctherm *tegra)
 {
-	struct device_node *np = pdev->dev.of_node;
 	int ret;
 
-	ret = soctherm_oc_int_init(np, TEGRA_SOC_OC_IRQ_MAX);
+	ret = soctherm_oc_int_init(tegra->dev->of_node, TEGRA_SOC_OC_IRQ_MAX);
 	if (ret < 0) {
-		dev_err(&pdev->dev, "soctherm_oc_int_init failed\n");
+		dev_err(tegra->dev, "soctherm_oc_int_init failed\n");
 		return ret;
 	}
 
-	tegra->thermal_irq = platform_get_irq(pdev, 0);
-	if (tegra->thermal_irq < 0) {
-		dev_dbg(&pdev->dev, "get 'thermal_irq' failed.\n");
-		return 0;
-	}
-
-	tegra->edp_irq = platform_get_irq(pdev, 1);
-	if (tegra->edp_irq < 0) {
-		dev_dbg(&pdev->dev, "get 'edp_irq' failed.\n");
-		return 0;
-	}
-
-	ret = devm_request_threaded_irq(&pdev->dev,
+	ret = devm_request_threaded_irq(tegra->dev,
 					tegra->thermal_irq,
 					soctherm_thermal_isr,
 					soctherm_thermal_isr_thread,
 					IRQF_ONESHOT,
-					dev_name(&pdev->dev),
+					dev_name(tegra->dev),
 					tegra);
 	if (ret < 0) {
-		dev_err(&pdev->dev, "request_irq 'thermal_irq' failed.\n");
+		dev_err(tegra->dev, "failed to request thermal IRQ: %d\n", ret);
 		return ret;
 	}
 
-	ret = devm_request_threaded_irq(&pdev->dev,
+	ret = devm_request_threaded_irq(tegra->dev,
 					tegra->edp_irq,
 					soctherm_edp_isr,
 					soctherm_edp_isr_thread,
@@ -2013,16 +1992,15 @@ static int soctherm_interrupts_init(struct platform_device *pdev,
 					"soctherm_edp",
 					tegra);
 	if (ret < 0) {
-		dev_err(&pdev->dev, "request_irq 'edp_irq' failed.\n");
+		dev_err(tegra->dev, "failed to request EDP IRQ: %d\n", ret);
 		return ret;
 	}
 
 	return 0;
 }
 
-static void soctherm_init(struct platform_device *pdev)
+static void soctherm_init(struct tegra_soctherm *tegra)
 {
-	struct tegra_soctherm *tegra = platform_get_drvdata(pdev);
 	const struct tegra_tsensor_group **ttgs = tegra->soc->ttgs;
 	int i;
 	u32 pdiv, hotspot;
@@ -2048,7 +2026,7 @@ static void soctherm_init(struct platform_device *pdev)
 	writel(hotspot, tegra->regs + SENSOR_HOTSPOT_OFF);
 
 	/* Configure hw throttle */
-	tegra_soctherm_throttle(&pdev->dev);
+	tegra_soctherm_throttle(tegra);
 }
 
 static const struct of_device_id tegra_soctherm_of_match[] = {
@@ -2096,9 +2074,9 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 	if (!tegra)
 		return -ENOMEM;
 
-	mutex_init(&tegra->thermctl_lock);
 	dev_set_drvdata(&pdev->dev, tegra);
-
+	mutex_init(&tegra->thermctl_lock);
+	tegra->dev = &pdev->dev;
 	tegra->soc = soc;
 
 	tegra->regs = devm_platform_ioremap_resource_byname(pdev, "soctherm-reg");
@@ -2121,6 +2099,9 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 		}
 	}
 
+	tegra->thermal_irq = platform_get_irq(pdev, 0);
+	tegra->edp_irq = platform_get_irq(pdev, 1);
+
 	tegra->reset = devm_reset_control_get(&pdev->dev, "soctherm");
 	if (IS_ERR(tegra->reset)) {
 		dev_err(&pdev->dev, "can't get soctherm reset\n");
@@ -2165,15 +2146,15 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 	if (!tegra->thermctl_tzs)
 		return -ENOMEM;
 
-	err = soctherm_clk_enable(pdev, true);
+	err = soctherm_clk_enable(tegra, true);
 	if (err)
 		return err;
 
-	soctherm_thermtrips_parse(pdev);
+	soctherm_thermtrips_parse(tegra);
 
-	soctherm_init_hw_throt_cdev(pdev);
+	soctherm_init_hw_throt_cdev(tegra);
 
-	soctherm_init(pdev);
+	soctherm_init(tegra);
 
 	for (i = 0; i < soc->num_ttgs; ++i) {
 		struct tegra_thermctl_zone *zone =
@@ -2184,7 +2165,6 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 		}
 
 		zone->reg = tegra->regs + soc->ttgs[i]->sensor_temp_offset;
-		zone->dev = &pdev->dev;
 		zone->sg = soc->ttgs[i];
 		zone->ts = tegra;
 
@@ -2202,19 +2182,19 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 		tegra->thermctl_tzs[soc->ttgs[i]->id] = z;
 
 		/* Configure hw trip points */
-		err = tegra_soctherm_set_hwtrips(&pdev->dev, soc->ttgs[i], z);
+		err = tegra_soctherm_set_hwtrips(tegra, soc->ttgs[i], z);
 		if (err)
 			goto disable_clocks;
 	}
 
-	err = soctherm_interrupts_init(pdev, tegra);
+	err = soctherm_interrupts_init(tegra);
 
-	soctherm_debug_init(pdev);
+	soctherm_debug_init(tegra);
 
 	return 0;
 
 disable_clocks:
-	soctherm_clk_enable(pdev, false);
+	soctherm_clk_enable(tegra, false);
 
 	return err;
 }
@@ -2225,42 +2205,39 @@ static void tegra_soctherm_remove(struct platform_device *pdev)
 
 	debugfs_remove_recursive(tegra->debugfs_dir);
 
-	soctherm_clk_enable(pdev, false);
+	soctherm_clk_enable(tegra, false);
 }
 
 static int __maybe_unused soctherm_suspend(struct device *dev)
 {
-	struct platform_device *pdev = to_platform_device(dev);
+	struct tegra_soctherm *tegra = dev_get_drvdata(dev);
 
-	soctherm_clk_enable(pdev, false);
+	soctherm_clk_enable(tegra, false);
 
 	return 0;
 }
 
 static int __maybe_unused soctherm_resume(struct device *dev)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct tegra_soctherm *tegra = platform_get_drvdata(pdev);
+	struct tegra_soctherm *tegra = dev_get_drvdata(dev);
 	struct tegra_soctherm_soc *soc = tegra->soc;
 	int err, i;
 
-	err = soctherm_clk_enable(pdev, true);
+	err = soctherm_clk_enable(tegra, true);
 	if (err) {
-		dev_err(&pdev->dev,
-			"Resume failed: enable clocks failed\n");
+		dev_err(dev, "Resume failed: enable clocks failed\n");
 		return err;
 	}
 
-	soctherm_init(pdev);
+	soctherm_init(tegra);
 
 	for (i = 0; i < soc->num_ttgs; ++i) {
 		struct thermal_zone_device *tz;
 
 		tz = tegra->thermctl_tzs[soc->ttgs[i]->id];
-		err = tegra_soctherm_set_hwtrips(dev, soc->ttgs[i], tz);
+		err = tegra_soctherm_set_hwtrips(tegra, soc->ttgs[i], tz);
 		if (err) {
-			dev_err(&pdev->dev,
-				"Resume failed: set hwtrips failed\n");
+			dev_err(dev, "Resume failed: set hwtrips failed\n");
 			return err;
 		}
 	}
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 05/13] thermal: tegra: Constify SoC-specific data
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (3 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 04/13] thermal: tegra: Use driver-private data consistently Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 06/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The SoC-specific parameter structure is read-only data, so consistently
use const to mark it as such.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index f434e141dcdf..11c23ace2c81 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -339,7 +339,7 @@ struct tegra_soctherm {
 
 	u32 *calib;
 	struct thermal_zone_device **thermctl_tzs;
-	struct tegra_soctherm_soc *soc;
+	const struct tegra_soctherm_soc *soc;
 
 	struct soctherm_throt_cfg throt_cfgs[THROTTLE_SIZE];
 
@@ -2054,19 +2054,15 @@ MODULE_DEVICE_TABLE(of, tegra_soctherm_of_match);
 
 static int tegra_soctherm_probe(struct platform_device *pdev)
 {
-	const struct of_device_id *match;
 	struct tegra_soctherm *tegra;
 	struct thermal_zone_device *z;
 	struct tsensor_shared_calib shared_calib;
-	struct tegra_soctherm_soc *soc;
+	const struct tegra_soctherm_soc *soc;
 	unsigned int i;
 	int err;
 
-	match = of_match_node(tegra_soctherm_of_match, pdev->dev.of_node);
-	if (!match)
-		return -ENODEV;
+	soc = device_get_match_data(&pdev->dev);
 
-	soc = (struct tegra_soctherm_soc *)match->data;
 	if (soc->num_ttgs > TEGRA124_SOCTHERM_SENSOR_NUM)
 		return -EINVAL;
 
@@ -2220,7 +2216,7 @@ static int __maybe_unused soctherm_suspend(struct device *dev)
 static int __maybe_unused soctherm_resume(struct device *dev)
 {
 	struct tegra_soctherm *tegra = dev_get_drvdata(dev);
-	struct tegra_soctherm_soc *soc = tegra->soc;
+	const struct tegra_soctherm_soc *soc = tegra->soc;
 	int err, i;
 
 	err = soctherm_clk_enable(tegra, true);
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 06/13] thermal: tegra: Do not register cooling device
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (4 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 05/13] thermal: tegra: Constify SoC-specific data Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-13 15:57   ` Daniel Lezcano
  2023-10-12 17:58 ` [PATCH v2 07/13] thermal: tegra: Use unsigned int where appropriate Thierry Reding
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The SOCTHERM's built-in throttling mechanism doesn't map well to the
concept of a cooling device because it will automatically start to
throttle when the programmed temperature threshold is crossed.

Remove the cooling device implementation and instead unconditionally
program the throttling for the CPU and GPU thermal zones.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v2:
- adapt to new DT bindings

 drivers/thermal/tegra/soctherm.c          | 287 +++++++++-------------
 drivers/thermal/tegra/soctherm.h          |   1 +
 drivers/thermal/tegra/tegra124-soctherm.c |   4 +
 drivers/thermal/tegra/tegra132-soctherm.c |   4 +
 drivers/thermal/tegra/tegra210-soctherm.c |   4 +
 5 files changed, 132 insertions(+), 168 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 11c23ace2c81..105ec20d509d 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -33,7 +33,6 @@
 
 #include <dt-bindings/thermal/tegra124-soctherm.h>
 
-#include "../thermal_core.h"
 #include "soctherm.h"
 
 #define SENSOR_CONFIG0				0
@@ -317,12 +316,16 @@ struct soctherm_throt_cfg {
 	const char *name;
 	unsigned int id;
 	u8 priority;
+	int temperature;
+	int hysteresis;
 	u8 cpu_throt_level;
 	u32 cpu_throt_depth;
 	u32 gpu_throt_level;
 	struct soctherm_oc_cfg oc_cfg;
-	struct thermal_cooling_device *cdev;
 	bool init;
+
+	unsigned int num_zones;
+	struct device_node **zones;
 };
 
 struct tegra_soctherm {
@@ -497,9 +500,7 @@ static int thermtrip_program(struct tegra_soctherm *ts,
  * throttrip_program() - Configures the hardware to throttle the
  * pulse if a given sensor group reaches a given temperature
  * @ts: pointer to a struct tegra_soctherm
- * @sg: pointer to the sensor group to set the thermtrip temperature for
  * @stc: pointer to the throttle need to be triggered
- * @trip_temp: the temperature in millicelsius to trigger the thermal trip at
  *
  * Sets the thermal trip threshold and throttle event of the given sensor
  * group. If this threshold is crossed, the hardware will trigger the
@@ -510,43 +511,68 @@ static int thermtrip_program(struct tegra_soctherm *ts,
  *
  * Return: 0 upon success, or %-EINVAL upon failure.
  */
-static int throttrip_program(struct tegra_soctherm *ts,
-			     const struct tegra_tsensor_group *sg,
-			     struct soctherm_throt_cfg *stc,
-			     int trip_temp)
+static int throttrip_program(struct tegra_soctherm *tegra,
+			     struct soctherm_throt_cfg *stc)
 {
-	int temp, cpu_throt, gpu_throt;
-	unsigned int throt;
-	u32 r, reg_off;
+	int high, low, cpu_throt, gpu_throt;
+	unsigned int throt, i, j;
 
-	if (!sg || !stc || !stc->init)
-		return -EINVAL;
+	high = enforce_temp_range(tegra, stc->temperature);
+	high /= tegra->soc->thresh_grain;
+	low = enforce_temp_range(tegra, stc->temperature - stc->hysteresis);
+	low /= tegra->soc->thresh_grain;
 
-	temp = enforce_temp_range(ts, trip_temp) / ts->soc->thresh_grain;
+	for (i = 0; i < stc->num_zones; i++) {
+		struct tegra_thermctl_zone *zone = NULL;
+		unsigned int offset;
+		u32 r;
 
-	/* Hardcode LIGHT on LEVEL1 and HEAVY on LEVEL2 */
-	throt = stc->id;
-	reg_off = THERMCTL_LVL_REG(sg->thermctl_lvl0_offset, throt + 1);
+		for (j = 0; j < tegra->soc->num_ttgs; j++) {
+			struct thermal_zone_device *tz = tegra->thermctl_tzs[j];
 
-	if (throt == THROTTLE_LIGHT) {
-		cpu_throt = THERMCTL_LVL0_CPU0_CPU_THROT_LIGHT;
-		gpu_throt = THERMCTL_LVL0_CPU0_GPU_THROT_LIGHT;
-	} else {
-		cpu_throt = THERMCTL_LVL0_CPU0_CPU_THROT_HEAVY;
-		gpu_throt = THERMCTL_LVL0_CPU0_GPU_THROT_HEAVY;
-		if (throt != THROTTLE_HEAVY)
-			dev_warn(ts->dev,
-				 "invalid throt id %d - assuming HEAVY",
-				 throt);
-	}
+			if (tz->device.of_node == stc->zones[i]) {
+				zone = thermal_zone_device_priv(tz);
+				break;
+			}
+		}
 
-	r = readl(ts->regs + reg_off);
-	r = REG_SET_MASK(r, sg->thermctl_lvl0_up_thresh_mask, temp);
-	r = REG_SET_MASK(r, sg->thermctl_lvl0_dn_thresh_mask, temp);
-	r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_CPU_THROT_MASK, cpu_throt);
-	r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_GPU_THROT_MASK, gpu_throt);
-	r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_EN_MASK, 1);
-	writel(r, ts->regs + reg_off);
+		if (!zone)
+			continue;
+
+		if (!zone->sg->can_throttle) {
+			dev_warn(tegra->dev, "zone %s cannot be throttled\n",
+				 zone->sg->name);
+			continue;
+		}
+
+		/* Hardcode LIGHT on LEVEL1 and HEAVY on LEVEL2 */
+		throt = stc->id;
+		offset = THERMCTL_LVL_REG(zone->sg->thermctl_lvl0_offset, throt + 1);
+
+		if (throt == THROTTLE_LIGHT) {
+			cpu_throt = THERMCTL_LVL0_CPU0_CPU_THROT_LIGHT;
+			gpu_throt = THERMCTL_LVL0_CPU0_GPU_THROT_LIGHT;
+		} else {
+			cpu_throt = THERMCTL_LVL0_CPU0_CPU_THROT_HEAVY;
+			gpu_throt = THERMCTL_LVL0_CPU0_GPU_THROT_HEAVY;
+			if (throt != THROTTLE_HEAVY)
+				dev_warn(tegra->dev,
+					 "invalid throt id %d - assuming HEAVY",
+					 throt);
+		}
+
+		r = readl(tegra->regs + offset);
+		r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_up_thresh_mask, high);
+		r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_dn_thresh_mask, low);
+		r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_CPU_THROT_MASK, cpu_throt);
+		r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_GPU_THROT_MASK, gpu_throt);
+		r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_EN_MASK, 1);
+		writel(r, tegra->regs + offset);
+
+		dev_info(tegra->dev,
+			 "throttrip: will throttle when %s reaches %d mC\n",
+			 zone->sg->name, stc->temperature);
+	}
 
 	return 0;
 }
@@ -606,25 +632,6 @@ static int tegra_thermctl_set_trip_temp(struct thermal_zone_device *tz, int trip
 			return thermtrip_program(ts, sg, temp);
 		else
 			return 0;
-
-	} else if (trip.type == THERMAL_TRIP_HOT) {
-		int i;
-
-		for (i = 0; i < THROTTLE_SIZE; i++) {
-			struct thermal_cooling_device *cdev;
-			struct soctherm_throt_cfg *stc;
-
-			if (!ts->throt_cfgs[i].init)
-				continue;
-
-			cdev = ts->throt_cfgs[i].cdev;
-			if (get_thermal_instance(tz, cdev, trip_id))
-				stc = find_throttle_cfg_by_name(ts, cdev->type);
-			else
-				continue;
-
-			return throttrip_program(ts, sg, stc, temp);
-		}
 	}
 
 	return 0;
@@ -685,29 +692,9 @@ static const struct thermal_zone_device_ops tegra_of_thermal_ops = {
 	.set_trips = tegra_thermctl_set_trips,
 };
 
-static int get_hot_temp(struct thermal_zone_device *tz, int *trip_id, int *temp)
-{
-	int i, ret;
-	struct thermal_trip trip;
-
-	for (i = 0; i < thermal_zone_get_num_trips(tz); i++) {
-
-		ret = thermal_zone_get_trip(tz, i, &trip);
-		if (ret)
-			return -EINVAL;
-
-		if (trip.type == THERMAL_TRIP_HOT) {
-			*trip_id = i;
-			return 0;
-		}
-	}
-
-	return -EINVAL;
-}
-
 /**
  * tegra_soctherm_set_hwtrips() - set HW trip point from DT data
- * @dev: struct device * of the SOC_THERM instance
+ * @ts: pointer to a struct tegra_soctherm
  * @sg: pointer to the sensor group to set the thermtrip temperature for
  * @tz: struct thermal_zone_device *
  *
@@ -725,16 +712,12 @@ static int get_hot_temp(struct thermal_zone_device *tz, int *trip_id, int *temp)
  * THERMTRIP has been enabled successfully when a message similar to
  * this one appears on the serial console:
  * "thermtrip: will shut down when sensor group XXX reaches YYYYYY mC"
- * THROTTLE has been enabled successfully when a message similar to
- * this one appears on the serial console:
- * ""throttrip: will throttle when sensor group XXX reaches YYYYYY mC"
  */
 static int tegra_soctherm_set_hwtrips(struct tegra_soctherm *ts,
 				      const struct tegra_tsensor_group *sg,
 				      struct thermal_zone_device *tz)
 {
-	struct soctherm_throt_cfg *stc;
-	int i, trip, temperature, ret;
+	int temperature, ret;
 
 	/* Get thermtrips. If missing, try to get critical trips. */
 	temperature = tsensor_group_thermtrip_get(ts, sg->id);
@@ -752,43 +735,31 @@ static int tegra_soctherm_set_hwtrips(struct tegra_soctherm *ts,
 	dev_info(ts->dev, "thermtrip: will shut down when %s reaches %d mC\n",
 		 sg->name, temperature);
 
-	ret = get_hot_temp(tz, &trip, &temperature);
-	if (ret) {
-		dev_info(ts->dev, "throttrip: %s: missing hot temperature\n",
-			 sg->name);
-		return 0;
-	}
-
-	for (i = 0; i < THROTTLE_OC1; i++) {
-		struct thermal_cooling_device *cdev;
-
-		if (!ts->throt_cfgs[i].init)
-			continue;
-
-		cdev = ts->throt_cfgs[i].cdev;
-		if (get_thermal_instance(tz, cdev, trip))
-			stc = find_throttle_cfg_by_name(ts, cdev->type);
-		else
-			continue;
-
-		ret = throttrip_program(ts, sg, stc, temperature);
-		if (ret) {
-			dev_err(ts->dev, "throttrip: %s: error during enable\n",
-				sg->name);
-			return ret;
-		}
+	return 0;
+}
 
-		dev_info(ts->dev,
-			 "throttrip: will throttle when %s reaches %d mC\n",
-			 sg->name, temperature);
-		break;
-	}
+/*
+ * soctherm_enable_hw_throttling() - enable hardware throttling trip points
+ * @tegra: pointer to a struct tegra_soctherm
+ *
+ * THROTTLE has been enabled successfully when a message similar to
+ * this one appears on the serial console:
+ * ""throttrip: will throttle when sensor group XXX reaches YYYYYY mC"
+ */
+static void soctherm_enable_hw_throttling(struct tegra_soctherm *tegra)
+{
+	struct soctherm_throt_cfg *stc;
+	int err;
 
-	if (i == THROTTLE_SIZE)
-		dev_info(ts->dev, "throttrip: %s: missing throttle cdev\n",
-			 sg->name);
+	/* if configured, program the pulse skipper for CPU and GPU zones */
+	stc = find_throttle_cfg_by_name(tegra, "heavy");
+	if (!stc)
+		return;
 
-	return 0;
+	err = throttrip_program(tegra, stc);
+	if (err)
+		dev_err(tegra->dev, "throttrip: %s: failed to enable: %d\n",
+			stc->name, err);
 }
 
 static irqreturn_t soctherm_thermal_isr(int irq, void *dev_id)
@@ -1494,40 +1465,6 @@ static int soctherm_clk_enable(struct tegra_soctherm *tegra, bool enable)
 	return 0;
 }
 
-static int throt_get_cdev_max_state(struct thermal_cooling_device *cdev,
-				    unsigned long *max_state)
-{
-	*max_state = 1;
-	return 0;
-}
-
-static int throt_get_cdev_cur_state(struct thermal_cooling_device *cdev,
-				    unsigned long *cur_state)
-{
-	struct tegra_soctherm *ts = cdev->devdata;
-	u32 r;
-
-	r = readl(ts->regs + THROT_STATUS);
-	if (REG_GET_MASK(r, THROT_STATUS_STATE_MASK))
-		*cur_state = 1;
-	else
-		*cur_state = 0;
-
-	return 0;
-}
-
-static int throt_set_cdev_state(struct thermal_cooling_device *cdev,
-				unsigned long cur_state)
-{
-	return 0;
-}
-
-static const struct thermal_cooling_device_ops throt_cooling_ops = {
-	.get_max_state = throt_get_cdev_max_state,
-	.get_cur_state = throt_get_cdev_cur_state,
-	.set_cur_state = throt_set_cdev_state,
-};
-
 static int soctherm_thermtrips_parse(struct tegra_soctherm *ts)
 {
 	struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
@@ -1633,6 +1570,33 @@ static int soctherm_throt_cfg_parse(struct tegra_soctherm *ts,
 	else
 		goto err;
 
+	ret = of_property_read_u32(np, "temperature-millicelsius",
+				   &stc->temperature);
+	if (ret < 0)
+		goto err;
+
+	ret = of_property_read_u32(np, "hysteresis-millicelsius",
+				   &stc->hysteresis);
+	if (ret < 0)
+		goto err;
+
+	stc->num_zones = of_count_phandle_with_args(np, "nvidia,thermal-zones",
+						    NULL);
+	if (stc->num_zones > 0) {
+		struct device_node *zone;
+		unsigned int i;
+
+		stc->zones = devm_kcalloc(ts->dev, stc->num_zones, sizeof(zone),
+					  GFP_KERNEL);
+		if (!stc->zones)
+			return -ENOMEM;
+
+		for (i = 0; i < stc->num_zones; i++) {
+			zone = of_parse_phandle(np, "nvidia,thermal-zones", i);
+			stc->zones[i] = zone;
+		}
+	}
+
 	return 0;
 
 err:
@@ -1642,14 +1606,12 @@ static int soctherm_throt_cfg_parse(struct tegra_soctherm *ts,
 }
 
 /**
- * soctherm_init_hw_throt_cdev() - Parse the HW throttle configurations
- * and register them as cooling devices.
+ * soctherm_init_hw_throttling() - parse the HW throttle configurations
  * @tegra: pointer to Tegra soctherm structure
  */
-static void soctherm_init_hw_throt_cdev(struct tegra_soctherm *tegra)
+static void soctherm_init_hw_throttling(struct tegra_soctherm *tegra)
 {
 	struct device_node *np_stc, *np_stcc;
-	const char *name;
 	int i;
 
 	for (i = 0; i < THROTTLE_SIZE; i++) {
@@ -1666,11 +1628,10 @@ static void soctherm_init_hw_throt_cdev(struct tegra_soctherm *tegra)
 	}
 
 	for_each_child_of_node(np_stc, np_stcc) {
+		const char *name = np_stcc->name;
 		struct soctherm_throt_cfg *stc;
-		struct thermal_cooling_device *tcd;
 		int err;
 
-		name = np_stcc->name;
 		stc = find_throttle_cfg_by_name(tegra, name);
 		if (!stc) {
 			dev_err(tegra->dev, "throttle-cfg: could not find %s\n",
@@ -1692,19 +1653,6 @@ static void soctherm_init_hw_throt_cdev(struct tegra_soctherm *tegra)
 		if (stc->id >= THROTTLE_OC1) {
 			soctherm_oc_cfg_parse(tegra, np_stcc, stc);
 			stc->init = true;
-		} else {
-
-			tcd = thermal_of_cooling_device_register(np_stcc,
-							 (char *)name, tegra,
-							 &throt_cooling_ops);
-			if (IS_ERR_OR_NULL(tcd)) {
-				dev_err(tegra->dev,
-					"throttle-cfg: %s: failed to register cooling device\n",
-					name);
-				continue;
-			}
-			stc->cdev = tcd;
-			stc->init = true;
 		}
 	}
 
@@ -2148,7 +2096,7 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 
 	soctherm_thermtrips_parse(tegra);
 
-	soctherm_init_hw_throt_cdev(tegra);
+	soctherm_init_hw_throttling(tegra);
 
 	soctherm_init(tegra);
 
@@ -2183,6 +2131,7 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 			goto disable_clocks;
 	}
 
+	soctherm_enable_hw_throttling(tegra);
 	err = soctherm_interrupts_init(tegra);
 
 	soctherm_debug_init(tegra);
@@ -2238,6 +2187,8 @@ static int __maybe_unused soctherm_resume(struct device *dev)
 		}
 	}
 
+	soctherm_enable_hw_throttling(tegra);
+
 	return 0;
 }
 
diff --git a/drivers/thermal/tegra/soctherm.h b/drivers/thermal/tegra/soctherm.h
index 70501e73d586..894bee5d96c5 100644
--- a/drivers/thermal/tegra/soctherm.h
+++ b/drivers/thermal/tegra/soctherm.h
@@ -83,6 +83,7 @@ struct tegra_tsensor_group {
 	u16 thermctl_lvl0_offset;
 	u32 thermctl_lvl0_up_thresh_mask;
 	u32 thermctl_lvl0_dn_thresh_mask;
+	bool can_throttle;
 };
 
 struct tegra_tsensor_configuration {
diff --git a/drivers/thermal/tegra/tegra124-soctherm.c b/drivers/thermal/tegra/tegra124-soctherm.c
index 20ad27f4d1a1..7b11fa8fb533 100644
--- a/drivers/thermal/tegra/tegra124-soctherm.c
+++ b/drivers/thermal/tegra/tegra124-soctherm.c
@@ -60,6 +60,7 @@ static const struct tegra_tsensor_group tegra124_tsensor_group_cpu = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_CPU,
 	.thermctl_lvl0_up_thresh_mask = TEGRA124_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA124_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = true,
 };
 
 static const struct tegra_tsensor_group tegra124_tsensor_group_gpu = {
@@ -79,6 +80,7 @@ static const struct tegra_tsensor_group tegra124_tsensor_group_gpu = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_GPU,
 	.thermctl_lvl0_up_thresh_mask = TEGRA124_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA124_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = true,
 };
 
 static const struct tegra_tsensor_group tegra124_tsensor_group_pll = {
@@ -96,6 +98,7 @@ static const struct tegra_tsensor_group tegra124_tsensor_group_pll = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_TSENSE,
 	.thermctl_lvl0_up_thresh_mask = TEGRA124_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA124_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = false,
 };
 
 static const struct tegra_tsensor_group tegra124_tsensor_group_mem = {
@@ -115,6 +118,7 @@ static const struct tegra_tsensor_group tegra124_tsensor_group_mem = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_MEM,
 	.thermctl_lvl0_up_thresh_mask = TEGRA124_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA124_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = false,
 };
 
 static const struct tegra_tsensor_group *tegra124_tsensor_groups[] = {
diff --git a/drivers/thermal/tegra/tegra132-soctherm.c b/drivers/thermal/tegra/tegra132-soctherm.c
index b76308fdad9e..304561fe9110 100644
--- a/drivers/thermal/tegra/tegra132-soctherm.c
+++ b/drivers/thermal/tegra/tegra132-soctherm.c
@@ -60,6 +60,7 @@ static const struct tegra_tsensor_group tegra132_tsensor_group_cpu = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_CPU,
 	.thermctl_lvl0_up_thresh_mask = TEGRA132_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA132_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = true,
 };
 
 static const struct tegra_tsensor_group tegra132_tsensor_group_gpu = {
@@ -79,6 +80,7 @@ static const struct tegra_tsensor_group tegra132_tsensor_group_gpu = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_GPU,
 	.thermctl_lvl0_up_thresh_mask = TEGRA132_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA132_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = true,
 };
 
 static const struct tegra_tsensor_group tegra132_tsensor_group_pll = {
@@ -96,6 +98,7 @@ static const struct tegra_tsensor_group tegra132_tsensor_group_pll = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_TSENSE,
 	.thermctl_lvl0_up_thresh_mask = TEGRA132_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA132_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = false,
 };
 
 static const struct tegra_tsensor_group tegra132_tsensor_group_mem = {
@@ -115,6 +118,7 @@ static const struct tegra_tsensor_group tegra132_tsensor_group_mem = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_MEM,
 	.thermctl_lvl0_up_thresh_mask = TEGRA132_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA132_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = false,
 };
 
 static const struct tegra_tsensor_group *tegra132_tsensor_groups[] = {
diff --git a/drivers/thermal/tegra/tegra210-soctherm.c b/drivers/thermal/tegra/tegra210-soctherm.c
index d0ff793f18c5..6277a8e12b8a 100644
--- a/drivers/thermal/tegra/tegra210-soctherm.c
+++ b/drivers/thermal/tegra/tegra210-soctherm.c
@@ -61,6 +61,7 @@ static const struct tegra_tsensor_group tegra210_tsensor_group_cpu = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_CPU,
 	.thermctl_lvl0_up_thresh_mask = TEGRA210_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA210_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = true,
 };
 
 static const struct tegra_tsensor_group tegra210_tsensor_group_gpu = {
@@ -80,6 +81,7 @@ static const struct tegra_tsensor_group tegra210_tsensor_group_gpu = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_GPU,
 	.thermctl_lvl0_up_thresh_mask = TEGRA210_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA210_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = true,
 };
 
 static const struct tegra_tsensor_group tegra210_tsensor_group_pll = {
@@ -97,6 +99,7 @@ static const struct tegra_tsensor_group tegra210_tsensor_group_pll = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_TSENSE,
 	.thermctl_lvl0_up_thresh_mask = TEGRA210_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA210_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = false,
 };
 
 static const struct tegra_tsensor_group tegra210_tsensor_group_mem = {
@@ -116,6 +119,7 @@ static const struct tegra_tsensor_group tegra210_tsensor_group_mem = {
 	.thermctl_lvl0_offset = THERMCTL_LEVEL0_GROUP_MEM,
 	.thermctl_lvl0_up_thresh_mask = TEGRA210_THERMCTL_LVL0_UP_THRESH_MASK,
 	.thermctl_lvl0_dn_thresh_mask = TEGRA210_THERMCTL_LVL0_DN_THRESH_MASK,
+	.can_throttle = false,
 };
 
 static const struct tegra_tsensor_group *tegra210_tsensor_groups[] = {
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 07/13] thermal: tegra: Use unsigned int where appropriate
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (5 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 06/13] thermal: tegra: Do not register cooling device Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 08/13] thermal: tegra: Avoid over-allocation of temporary array Thierry Reding
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

Use unsigned integers more consistently, which helps to make it more
explicit about what values can be expected.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 105ec20d509d..c7f8e36cbeab 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -591,8 +591,9 @@ find_throttle_cfg_by_name(struct tegra_soctherm *ts, const char *name)
 
 static int tsensor_group_thermtrip_get(struct tegra_soctherm *ts, int id)
 {
-	int i, temp = min_low_temp;
 	struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
+	int temp = min_low_temp;
+	unsigned int i;
 
 	if (id >= TEGRA124_SOCTHERM_SENSOR_NUM)
 		return temp;
@@ -1469,33 +1470,34 @@ static int soctherm_thermtrips_parse(struct tegra_soctherm *ts)
 {
 	struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
 	const int max_num_prop = ts->soc->num_ttgs * 2;
+	unsigned int i, j, count;
 	u32 *tlb;
-	int i, j, n, ret;
+	int ret;
 
 	if (!tt)
 		return -ENOMEM;
 
-	n = of_property_count_u32_elems(ts->dev->of_node, "nvidia,thermtrips");
-	if (n <= 0) {
+	ret = of_property_count_u32_elems(ts->dev->of_node, "nvidia,thermtrips");
+	if (ret <= 0) {
 		dev_info(ts->dev,
-			 "missing thermtrips, will use critical trips as shut down temp\n");
-		return n;
+			 "missing thermtrips, will use critical trips as shut down temperature\n");
+		return ret;
 	}
 
-	n = min(max_num_prop, n);
+	count = min_t(unsigned int, ret, ts->soc->num_ttgs * 2);
 
 	tlb = devm_kcalloc(ts->dev, max_num_prop, sizeof(u32), GFP_KERNEL);
 	if (!tlb)
 		return -ENOMEM;
+
 	ret = of_property_read_u32_array(ts->dev->of_node, "nvidia,thermtrips",
-					 tlb, n);
+					 tlb, count);
 	if (ret) {
 		dev_err(ts->dev, "invalid num ele: thermtrips:%d\n", ret);
 		return ret;
 	}
 
-	i = 0;
-	for (j = 0; j < n; j = j + 2) {
+	for (i = 0, j = 0; j < count; j = j + 2) {
 		if (tlb[j] >= TEGRA124_SOCTHERM_SENSOR_NUM)
 			continue;
 
@@ -1612,7 +1614,7 @@ static int soctherm_throt_cfg_parse(struct tegra_soctherm *ts,
 static void soctherm_init_hw_throttling(struct tegra_soctherm *tegra)
 {
 	struct device_node *np_stc, *np_stcc;
-	int i;
+	unsigned int i;
 
 	for (i = 0; i < THROTTLE_SIZE; i++) {
 		tegra->throt_cfgs[i].name = throt_names[i];
@@ -1875,8 +1877,8 @@ static void soctherm_throttle_program(struct tegra_soctherm *ts,
 
 static void tegra_soctherm_throttle(struct tegra_soctherm *ts)
 {
+	unsigned int i;
 	u32 v;
-	int i;
 
 	/* configure LOW, MED and HIGH levels for CCROC NV_THERM */
 	if (ts->soc->use_ccroc) {
@@ -1950,8 +1952,8 @@ static int soctherm_interrupts_init(struct tegra_soctherm *tegra)
 static void soctherm_init(struct tegra_soctherm *tegra)
 {
 	const struct tegra_tsensor_group **ttgs = tegra->soc->ttgs;
-	int i;
 	u32 pdiv, hotspot;
+	unsigned int i;
 
 	/* Initialize raw sensors */
 	for (i = 0; i < tegra->soc->num_tsensors; ++i)
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 08/13] thermal: tegra: Avoid over-allocation of temporary array
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (6 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 07/13] thermal: tegra: Use unsigned int where appropriate Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 09/13] thermal: tegra: Remove gratuitous error assignment Thierry Reding
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The code will attempt to read "count" entries from DT, but the code
allocates the maximum number that is possible, potentially over-
allocating the array. Use the actual number of entries when allocating.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index c7f8e36cbeab..88ceeb8491cc 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -1469,7 +1469,6 @@ static int soctherm_clk_enable(struct tegra_soctherm *tegra, bool enable)
 static int soctherm_thermtrips_parse(struct tegra_soctherm *ts)
 {
 	struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
-	const int max_num_prop = ts->soc->num_ttgs * 2;
 	unsigned int i, j, count;
 	u32 *tlb;
 	int ret;
@@ -1486,7 +1485,7 @@ static int soctherm_thermtrips_parse(struct tegra_soctherm *ts)
 
 	count = min_t(unsigned int, ret, ts->soc->num_ttgs * 2);
 
-	tlb = devm_kcalloc(ts->dev, max_num_prop, sizeof(u32), GFP_KERNEL);
+	tlb = devm_kcalloc(ts->dev, count, sizeof(u32), GFP_KERNEL);
 	if (!tlb)
 		return -ENOMEM;
 
@@ -1506,6 +1505,8 @@ static int soctherm_thermtrips_parse(struct tegra_soctherm *ts)
 		i++;
 	}
 
+	devm_kfree(ts->dev, tlb);
+
 	return 0;
 }
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 09/13] thermal: tegra: Remove gratuitous error assignment
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (7 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 08/13] thermal: tegra: Avoid over-allocation of temporary array Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 10/13] thermal: tegra: Minor stylistic cleanups Thierry Reding
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

Interrupts are optional, so errors during their initialization are
ignored. However, the code confusingly stores the error code and makes
it looks like it is somehow relevant. Remove the gratuitous assignment
to make it clearer that these errors are being purposefully ignored.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 88ceeb8491cc..77051d08e69f 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -2135,8 +2135,7 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 	}
 
 	soctherm_enable_hw_throttling(tegra);
-	err = soctherm_interrupts_init(tegra);
-
+	soctherm_interrupts_init(tegra);
 	soctherm_debug_init(tegra);
 
 	return 0;
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 10/13] thermal: tegra: Minor stylistic cleanups
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (8 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 09/13] thermal: tegra: Remove gratuitous error assignment Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 11/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210 Thierry Reding
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

Fix up some whitespace (padding, identation) issues and reword an
unreadable comment.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 77051d08e69f..f749fb960ebe 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -1492,7 +1492,9 @@ static int soctherm_thermtrips_parse(struct tegra_soctherm *ts)
 	ret = of_property_read_u32_array(ts->dev->of_node, "nvidia,thermtrips",
 					 tlb, count);
 	if (ret) {
-		dev_err(ts->dev, "invalid num ele: thermtrips:%d\n", ret);
+		dev_err(ts->dev,
+			"failed to read \"nvidia,thermtrips\" property: %d\n",
+			ret);
 		return ret;
 	}
 
@@ -1812,7 +1814,7 @@ static void throttlectl_gpu_level_select(struct tegra_soctherm *ts,
 }
 
 static int soctherm_oc_cfg_program(struct tegra_soctherm *ts,
-				      enum soctherm_throttle_id throt)
+				   enum soctherm_throttle_id throt)
 {
 	u32 r;
 	struct soctherm_oc_cfg *oc = &ts->throt_cfgs[throt].oc_cfg;
@@ -2104,8 +2106,9 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 	soctherm_init(tegra);
 
 	for (i = 0; i < soc->num_ttgs; ++i) {
-		struct tegra_thermctl_zone *zone =
-			devm_kzalloc(&pdev->dev, sizeof(*zone), GFP_KERNEL);
+		struct tegra_thermctl_zone *zone;
+
+		zone = devm_kzalloc(&pdev->dev, sizeof(*zone), GFP_KERNEL);
 		if (!zone) {
 			err = -ENOMEM;
 			goto disable_clocks;
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 11/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (9 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 10/13] thermal: tegra: Minor stylistic cleanups Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 11/13] ARM: tegra: Rework SOCTHERM on Tegra124 Thierry Reding
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The "heavy throttle" cooling device that SOCTHERM uses isn't a cooling
device in the traditional sense. It's an automatic mechanism that cannot
be actively controlled. Do not expose it as a cooling device and instead
of tying it to a specific trip point, hard-code the temperature at which
the automatic throttling will begin.

While at it, clean up the trip point names to reflect the names used by
the thermal device tree bindings.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm64/boot/dts/nvidia/tegra132.dtsi | 66 ++++++------------
 arch/arm64/boot/dts/nvidia/tegra210.dtsi | 86 +++++++-----------------
 2 files changed, 45 insertions(+), 107 deletions(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra132.dtsi b/arch/arm64/boot/dts/nvidia/tegra132.dtsi
index 7e24a212c7e4..105e01be341b 100644
--- a/arch/arm64/boot/dts/nvidia/tegra132.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra132.dtsi
@@ -878,11 +878,13 @@ soctherm: thermal-sensor@700e2000 {
 		#thermal-sensor-cells = <1>;
 
 		throttle-cfgs {
-			throttle_heavy: heavy {
+			heavy {
 				nvidia,priority = <100>;
 				nvidia,cpu-throt-level = <TEGRA_SOCTHERM_THROT_LEVEL_HIGH>;
-
-				#cooling-cells = <2>;
+				nvidia,thermal-zones = <&{/thermal-zones/cpu-thermal}>,
+						       <&{/thermal-zones/gpu-thermal}>;
+				temperature-millicelsius = <102000>;
+				hysteresis-millicelsius = <4000>;
 			};
 		};
 	};
@@ -1138,114 +1140,84 @@ cpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
 
 			trips {
-				cpu_shutdown_trip {
+				critical {
 					temperature = <105000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 
-				cpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <102000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&cpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		mem-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
 
 			trips {
-				mem_shutdown_trip {
+				critical {
 					temperature = <101000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
-				mem_throttle_trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 
 		gpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
 
 			trips {
-				gpu_shutdown_trip {
+				critical {
 					temperature = <101000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 
-				gpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&gpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		pllx-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
 
 			trips {
-				pllx_shutdown_trip {
+				critical {
 					temperature = <105000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
-				pllx_throttle_trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 	};
 
diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 47f8268e46bf..b340579594e4 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -1329,12 +1329,14 @@ soctherm: thermal-sensor@700e2000 {
 		#thermal-sensor-cells = <1>;
 
 		throttle-cfgs {
-			throttle_heavy: heavy {
+			heavy {
 				nvidia,priority = <100>;
 				nvidia,cpu-throt-percent = <85>;
 				nvidia,gpu-throt-level = <TEGRA_SOCTHERM_THROT_LEVEL_HIGH>;
-
-				#cooling-cells = <2>;
+				nvidia,thermal-zones = <&{/thermal-zones/cpu-thermal}>,
+						       <&{/thermal-zones/gpu-thermal}>;
+				temperature-millicelsius = <98500>;
+				hysteresis-millicelsius = <4000>;
 			};
 		};
 	};
@@ -2032,73 +2034,53 @@ cpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
 
 			trips {
-				cpu-shutdown-trip {
+				critical {
 					temperature = <102500>;
 					hysteresis = <0>;
 					type = "critical";
 				};
 
-				cpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <98500>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&cpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		mem-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
 
 			trips {
-				dram_nominal: mem-nominal-trip {
-					temperature = <50000>;
-					hysteresis = <1000>;
-					type = "passive";
-				};
-
-				dram_throttle: mem-throttle-trip {
-					temperature = <70000>;
-					hysteresis = <1000>;
-					type = "active";
+				critical {
+					temperature = <103000>;
+					hysteresis = <0>;
+					type = "critical";
 				};
 
-				mem-hot-trip {
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
-				mem-shutdown-trip {
-					temperature = <103000>;
-					hysteresis = <0>;
-					type = "critical";
+				emc_throttle_trip: passive {
+					temperature = <95000>;
+					hysteresis = <2000>;
+					type = "passive";
 				};
 			};
 
 			cooling-maps {
-				dram-passive {
-					cooling-device = <&emc 0 0>;
-					trip = <&dram_nominal>;
-				};
-
-				dram-active {
+				map-passive {
 					cooling-device = <&emc 1 1>;
-					trip = <&dram_throttle>;
+					trip = <&emc_throttle_trip>;
 				};
 			};
 		};
@@ -2107,58 +2089,42 @@ gpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
 
 			trips {
-				gpu-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
 
-				gpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&gpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		pllx-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
 
 			trips {
-				pllx-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
 
-				pllx-throttle-trip {
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 	};
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 11/13] ARM: tegra: Rework SOCTHERM on Tegra124
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (10 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 11/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210 Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 12/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210 Thierry Reding
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The "heavy throttle" cooling device that SOCTHERM uses isn't a cooling
device in the traditional sense. It's an automatic mechanism that cannot
be actively controlled. Do not expose it as a cooling device and instead
of tying it to a specific trip point, hard-code the temperature at which
the automatic throttling will begin.

While at it, clean up the trip point names to reflect the names used by
the thermal device tree bindings.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v2:
- update for DT bindings changes

 arch/arm/boot/dts/nvidia/tegra124.dtsi | 68 ++++++++------------------
 1 file changed, 21 insertions(+), 47 deletions(-)

diff --git a/arch/arm/boot/dts/nvidia/tegra124.dtsi b/arch/arm/boot/dts/nvidia/tegra124.dtsi
index 8f1fff373461..6af944fe6769 100644
--- a/arch/arm/boot/dts/nvidia/tegra124.dtsi
+++ b/arch/arm/boot/dts/nvidia/tegra124.dtsi
@@ -928,12 +928,14 @@ soctherm: thermal-sensor@700e2000 {
 		#thermal-sensor-cells = <1>;
 
 		throttle-cfgs {
-			throttle_heavy: heavy {
+			heavy {
 				nvidia,priority = <100>;
 				nvidia,cpu-throt-percent = <85>;
 				nvidia,gpu-throt-level = <TEGRA_SOCTHERM_THROT_LEVEL_HIGH>;
-
-				#cooling-cells = <2>;
+				nvidia,thermal-zones = <&{/thermal-zones/cpu-thermal}>,
+						       <&{/thermal-zones/gpu-thermal}>;
+				temperature-millicelsius = <100000>;
+				hysteresis-millicelsius = <4000>;
 			};
 		};
 	};
@@ -1238,112 +1240,84 @@ cpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
 
 			trips {
-				cpu-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				cpu_throttle_trip: throttle-trip {
+
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&cpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		mem-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
 
 			trips {
-				mem-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				mem-throttle-trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 
 		gpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
 
 			trips {
-				gpu-shutdown-trip {
+				critical {
 					temperature = <101000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				gpu_throttle_trip: throttle-trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&gpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		pllx-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
 
 			trips {
-				pllx-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				pllx-throttle-trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 	};
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 12/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (11 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 11/13] ARM: tegra: Rework SOCTHERM on Tegra124 Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 12/13] ARM: tegra: Rework SOCTHERM on Tegra124 Thierry Reding
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The "heavy throttle" cooling device that SOCTHERM uses isn't a cooling
device in the traditional sense. It's an automatic mechanism that cannot
be actively controlled. Do not expose it as a cooling device and instead
of tying it to a specific trip point, hard-code the temperature at which
the automatic throttling will begin.

While at it, clean up the trip point names to reflect the names used by
the thermal device tree bindings.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v2:
- update for DT bindings changes

 arch/arm64/boot/dts/nvidia/tegra132.dtsi | 66 ++++++------------
 arch/arm64/boot/dts/nvidia/tegra210.dtsi | 86 +++++++-----------------
 2 files changed, 45 insertions(+), 107 deletions(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra132.dtsi b/arch/arm64/boot/dts/nvidia/tegra132.dtsi
index 7e24a212c7e4..105e01be341b 100644
--- a/arch/arm64/boot/dts/nvidia/tegra132.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra132.dtsi
@@ -878,11 +878,13 @@ soctherm: thermal-sensor@700e2000 {
 		#thermal-sensor-cells = <1>;
 
 		throttle-cfgs {
-			throttle_heavy: heavy {
+			heavy {
 				nvidia,priority = <100>;
 				nvidia,cpu-throt-level = <TEGRA_SOCTHERM_THROT_LEVEL_HIGH>;
-
-				#cooling-cells = <2>;
+				nvidia,thermal-zones = <&{/thermal-zones/cpu-thermal}>,
+						       <&{/thermal-zones/gpu-thermal}>;
+				temperature-millicelsius = <102000>;
+				hysteresis-millicelsius = <4000>;
 			};
 		};
 	};
@@ -1138,114 +1140,84 @@ cpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
 
 			trips {
-				cpu_shutdown_trip {
+				critical {
 					temperature = <105000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 
-				cpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <102000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&cpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		mem-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
 
 			trips {
-				mem_shutdown_trip {
+				critical {
 					temperature = <101000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
-				mem_throttle_trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 
 		gpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
 
 			trips {
-				gpu_shutdown_trip {
+				critical {
 					temperature = <101000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 
-				gpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&gpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		pllx-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
 
 			trips {
-				pllx_shutdown_trip {
+				critical {
 					temperature = <105000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
-				pllx_throttle_trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 	};
 
diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 47f8268e46bf..b340579594e4 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -1329,12 +1329,14 @@ soctherm: thermal-sensor@700e2000 {
 		#thermal-sensor-cells = <1>;
 
 		throttle-cfgs {
-			throttle_heavy: heavy {
+			heavy {
 				nvidia,priority = <100>;
 				nvidia,cpu-throt-percent = <85>;
 				nvidia,gpu-throt-level = <TEGRA_SOCTHERM_THROT_LEVEL_HIGH>;
-
-				#cooling-cells = <2>;
+				nvidia,thermal-zones = <&{/thermal-zones/cpu-thermal}>,
+						       <&{/thermal-zones/gpu-thermal}>;
+				temperature-millicelsius = <98500>;
+				hysteresis-millicelsius = <4000>;
 			};
 		};
 	};
@@ -2032,73 +2034,53 @@ cpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
 
 			trips {
-				cpu-shutdown-trip {
+				critical {
 					temperature = <102500>;
 					hysteresis = <0>;
 					type = "critical";
 				};
 
-				cpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <98500>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&cpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		mem-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
 
 			trips {
-				dram_nominal: mem-nominal-trip {
-					temperature = <50000>;
-					hysteresis = <1000>;
-					type = "passive";
-				};
-
-				dram_throttle: mem-throttle-trip {
-					temperature = <70000>;
-					hysteresis = <1000>;
-					type = "active";
+				critical {
+					temperature = <103000>;
+					hysteresis = <0>;
+					type = "critical";
 				};
 
-				mem-hot-trip {
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 
-				mem-shutdown-trip {
-					temperature = <103000>;
-					hysteresis = <0>;
-					type = "critical";
+				emc_throttle_trip: passive {
+					temperature = <95000>;
+					hysteresis = <2000>;
+					type = "passive";
 				};
 			};
 
 			cooling-maps {
-				dram-passive {
-					cooling-device = <&emc 0 0>;
-					trip = <&dram_nominal>;
-				};
-
-				dram-active {
+				map-passive {
 					cooling-device = <&emc 1 1>;
-					trip = <&dram_throttle>;
+					trip = <&emc_throttle_trip>;
 				};
 			};
 		};
@@ -2107,58 +2089,42 @@ gpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
 
 			trips {
-				gpu-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
 
-				gpu_throttle_trip: throttle-trip {
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&gpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		pllx-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
 
 			trips {
-				pllx-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
 
-				pllx-throttle-trip {
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 	};
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 12/13] ARM: tegra: Rework SOCTHERM on Tegra124
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (12 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 12/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210 Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-12 17:58 ` [PATCH v2 13/13] thermal: Enforce self-encapsulation Thierry Reding
  2023-10-13  9:14 ` [PATCH v2 00/13] thermal: tegra: Do not register cooling device Nicolas Chauvet
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

The "heavy throttle" cooling device that SOCTHERM uses isn't a cooling
device in the traditional sense. It's an automatic mechanism that cannot
be actively controlled. Do not expose it as a cooling device and instead
of tying it to a specific trip point, hard-code the temperature at which
the automatic throttling will begin.

While at it, clean up the trip point names to reflect the names used by
the thermal device tree bindings.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 arch/arm/boot/dts/nvidia/tegra124.dtsi | 65 +++++++-------------------
 1 file changed, 18 insertions(+), 47 deletions(-)

diff --git a/arch/arm/boot/dts/nvidia/tegra124.dtsi b/arch/arm/boot/dts/nvidia/tegra124.dtsi
index 8f1fff373461..ea692eb09067 100644
--- a/arch/arm/boot/dts/nvidia/tegra124.dtsi
+++ b/arch/arm/boot/dts/nvidia/tegra124.dtsi
@@ -928,12 +928,11 @@ soctherm: thermal-sensor@700e2000 {
 		#thermal-sensor-cells = <1>;
 
 		throttle-cfgs {
-			throttle_heavy: heavy {
+			heavy {
 				nvidia,priority = <100>;
 				nvidia,cpu-throt-percent = <85>;
 				nvidia,gpu-throt-level = <TEGRA_SOCTHERM_THROT_LEVEL_HIGH>;
-
-				#cooling-cells = <2>;
+				temperature = <100000>;
 			};
 		};
 	};
@@ -1238,112 +1237,84 @@ cpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
 
 			trips {
-				cpu-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				cpu_throttle_trip: throttle-trip {
+
+				hot {
 					temperature = <100000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&cpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		mem-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
 
 			trips {
-				mem-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				mem-throttle-trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 
 		gpu-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
 
 			trips {
-				gpu-shutdown-trip {
+				critical {
 					temperature = <101000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				gpu_throttle_trip: throttle-trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&gpu_throttle_trip>;
-					cooling-device = <&throttle_heavy 1 1>;
-				};
-			};
 		};
 
 		pllx-thermal {
 			polling-delay-passive = <1000>;
 			polling-delay = <1000>;
 
-			thermal-sensors =
-				<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
+			thermal-sensors = <&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
 
 			trips {
-				pllx-shutdown-trip {
+				critical {
 					temperature = <103000>;
 					hysteresis = <0>;
 					type = "critical";
 				};
-				pllx-throttle-trip {
+
+				hot {
 					temperature = <99000>;
 					hysteresis = <1000>;
 					type = "hot";
 				};
 			};
-
-			cooling-maps {
-				/*
-				 * There are currently no cooling maps,
-				 * because there are no cooling devices.
-				 */
-			};
 		};
 	};
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 13/13] thermal: Enforce self-encapsulation
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (13 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 12/13] ARM: tegra: Rework SOCTHERM on Tegra124 Thierry Reding
@ 2023-10-12 17:58 ` Thierry Reding
  2023-10-13  9:14 ` [PATCH v2 00/13] thermal: tegra: Do not register cooling device Nicolas Chauvet
  15 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-12 17:58 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, Thierry Reding, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

From: Thierry Reding <treding@nvidia.com>

With the last driver being fixed to not reach into the core anymore,
make sure that no new drivers will be able to do so by generating a
build error when they try to do so.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/thermal_core.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index 91d9d4f63609..9c599658b866 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -15,7 +15,7 @@
 #include "thermal_netlink.h"
 
 #ifndef THERMAL_CORE_SUBSYS
-#warning This header can only be included by the thermal core code
+#error This header can only be included by the thermal core code
 #endif
 
 /* Default Thermal Governor */
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/13] thermal: tegra: Use driver-private data consistently
  2023-10-12 17:58 ` [PATCH v2 04/13] thermal: tegra: Use driver-private data consistently Thierry Reding
@ 2023-10-13  8:04   ` Daniel Lezcano
  2023-10-13 11:40     ` Thierry Reding
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Lezcano @ 2023-10-13  8:04 UTC (permalink / raw)
  To: Thierry Reding, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

On 12/10/2023 19:58, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Instead of passing around platform and plain devices and figuring out
> the driver-private data within each helper, directly pass around the
> driver-private data when it's available.
> 
> Also store a pointer to the parent device in the main driver-private
> data structure for easier access.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---

[ ... ]

> -static void soctherm_debug_init(struct platform_device *pdev)
> +static void soctherm_debug_init(struct tegra_soctherm *tegra)
>   {
> -	struct tegra_soctherm *tegra = platform_get_drvdata(pdev);
>   	struct dentry *root;
>   
>   	root = debugfs_create_dir("soctherm", NULL);
>   
>   	tegra->debugfs_dir = root;
>   
> -	debugfs_create_file("reg_contents", 0644, root, pdev, &regs_fops);
> +	debugfs_create_file("reg_contents", 0644, root, tegra, &regs_fops);

(Orthogonal to this series) : in case you are not aware of it there is 
the debugfs_create_regset32() function. That may make go away a bunch of 
code related to the debugfs code here.

cf. 
https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git/tree/drivers/thermal/mediatek/lvts_thermal.c?h=thermal/bleeding-edge#n159

[ ... ]

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] thermal: tegra: Do not register cooling device
  2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
                   ` (14 preceding siblings ...)
  2023-10-12 17:58 ` [PATCH v2 13/13] thermal: Enforce self-encapsulation Thierry Reding
@ 2023-10-13  9:14 ` Nicolas Chauvet
  2023-10-13 11:43   ` Thierry Reding
  15 siblings, 1 reply; 30+ messages in thread
From: Nicolas Chauvet @ 2023-10-13  9:14 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Daniel Lezcano, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Amit Kucheria, Zhang Rui,
	Jon Hunter, linux-pm, linux-tegra, devicetree

Le jeu. 12 oct. 2023 à 19:58, Thierry Reding
<thierry.reding@gmail.com> a écrit :
>
> From: Thierry Reding <treding@nvidia.com>
>
> Hi,
>
> this set of patches removes the registration of the SOCTHERM internal
> throttling mechanism as cooling device. Since this throttling starts
> automatically once a certain temperature threshold is crossed, it
> doesn't make sense to represent it as a cooling device, which are
> typically "manually" activated by the thermal framework when thermal
> sensors report temperature thresholds being crossed.
>
> Instead of using the cooling device mechanism, this statically programs
> the throttling mechanism when it is configured in device tree. In order
> to do this, an additional device tree property is needed to replace the
> information that was previously contained in trip points.
>
> There's a few preparatory patches to make the removal a bit simpler and
> also some follow up cleanups included as well.
>
> Changes in v2:
> - rework the device tree bindings:
>   - add nvidia,thermal-zones property to attach throttling to zones
>   - use -millicelsius suffix and add hysteresis
> - add patch to store thermal zone device tree node for later use
> - add patch to enforce self-encapsulation of the thermal core now that
>   no drivers need to reach into it anymore
>
> This applies on top of Daniel's self-encapsulation hardening series:
>
>         https://lore.kernel.org/all/20231012102700.2858952-1-daniel.lezcano@linaro.org/
>
> Thierry
>
> Thierry Reding (13):
>   thermal: Store device tree node for thermal zone devices
>   dt-bindings: thermal: tegra: Document throttle temperature
>   dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
>   thermal: tegra: Use driver-private data consistently
>   thermal: tegra: Constify SoC-specific data
>   thermal: tegra: Do not register cooling device
>   thermal: tegra: Use unsigned int where appropriate
>   thermal: tegra: Avoid over-allocation of temporary array
>   thermal: tegra: Remove gratuitous error assignment
>   thermal: tegra: Minor stylistic cleanups
>   ARM: tegra: Rework SOCTHERM on Tegra124
>   arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
>   thermal: Enforce self-encapsulation
>
>  .../thermal/nvidia,tegra124-soctherm.yaml     |  19 +
>  arch/arm/boot/dts/nvidia/tegra124.dtsi        |  68 +--
>  arch/arm64/boot/dts/nvidia/tegra132.dtsi      |  66 +--
>  arch/arm64/boot/dts/nvidia/tegra210.dtsi      |  86 +--
>  drivers/thermal/tegra/soctherm.c              | 525 ++++++++----------
>  drivers/thermal/tegra/soctherm.h              |   1 +
>  drivers/thermal/tegra/tegra124-soctherm.c     |   4 +
>  drivers/thermal/tegra/tegra132-soctherm.c     |   4 +
>  drivers/thermal/tegra/tegra210-soctherm.c     |   4 +
>  drivers/thermal/thermal_core.h                |   2 +-
>  drivers/thermal/thermal_of.c                  |   3 +
>  11 files changed, 329 insertions(+), 453 deletions(-)
>
> --
> 2.42.0
>

I'm still experiencing the following message on jetson-tx1 with this
serie applied on top of 6.6-rc5 (with iommu-next and tegra-next
applied).
oct. 13 10:53:16 jetson-tx1 kernel: max77620-thermal max77620-thermal:
Failed to register thermal zone: -19
oct. 13 10:53:16 jetson-tx1 kernel: tegra_soctherm
700e2000.thermal-sensor: throttle-cfg: heavy: no throt prop or invalid
prop

Is this expected ?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/13] thermal: tegra: Use driver-private data consistently
  2023-10-13  8:04   ` Daniel Lezcano
@ 2023-10-13 11:40     ` Thierry Reding
  0 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-13 11:40 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J . Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm,
	devicetree, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 1790 bytes --]

On Fri, Oct 13, 2023 at 10:04:33AM +0200, Daniel Lezcano wrote:
> On 12/10/2023 19:58, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > Instead of passing around platform and plain devices and figuring out
> > the driver-private data within each helper, directly pass around the
> > driver-private data when it's available.
> > 
> > Also store a pointer to the parent device in the main driver-private
> > data structure for easier access.
> > 
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> 
> [ ... ]
> 
> > -static void soctherm_debug_init(struct platform_device *pdev)
> > +static void soctherm_debug_init(struct tegra_soctherm *tegra)
> >   {
> > -	struct tegra_soctherm *tegra = platform_get_drvdata(pdev);
> >   	struct dentry *root;
> >   	root = debugfs_create_dir("soctherm", NULL);
> >   	tegra->debugfs_dir = root;
> > -	debugfs_create_file("reg_contents", 0644, root, pdev, &regs_fops);
> > +	debugfs_create_file("reg_contents", 0644, root, tegra, &regs_fops);
> 
> (Orthogonal to this series) : in case you are not aware of it there is the
> debugfs_create_regset32() function. That may make go away a bunch of code
> related to the debugfs code here.
> 
> cf. https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git/tree/drivers/thermal/mediatek/lvts_thermal.c?h=thermal/bleeding-edge#n159

Doesn't look like we could 1:1 transition to that because a bunch of the
output is parameterized on some per-SoC variables, but it might be
possible to rework the whole debugfs interfaces and split it up into
multiple directories and files, much like LVTS does this. That's going
to be quite a bit of work, but I can add it to my TODO list to look at
when I get some spare time.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] thermal: tegra: Do not register cooling device
  2023-10-13  9:14 ` [PATCH v2 00/13] thermal: tegra: Do not register cooling device Nicolas Chauvet
@ 2023-10-13 11:43   ` Thierry Reding
  2023-10-13 12:45     ` Nicolas Chauvet
  0 siblings, 1 reply; 30+ messages in thread
From: Thierry Reding @ 2023-10-13 11:43 UTC (permalink / raw)
  To: Nicolas Chauvet
  Cc: Daniel Lezcano, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Amit Kucheria, Zhang Rui,
	Jon Hunter, linux-pm, linux-tegra, devicetree

[-- Attachment #1: Type: text/plain, Size: 4089 bytes --]

On Fri, Oct 13, 2023 at 11:14:25AM +0200, Nicolas Chauvet wrote:
> Le jeu. 12 oct. 2023 à 19:58, Thierry Reding
> <thierry.reding@gmail.com> a écrit :
> >
> > From: Thierry Reding <treding@nvidia.com>
> >
> > Hi,
> >
> > this set of patches removes the registration of the SOCTHERM internal
> > throttling mechanism as cooling device. Since this throttling starts
> > automatically once a certain temperature threshold is crossed, it
> > doesn't make sense to represent it as a cooling device, which are
> > typically "manually" activated by the thermal framework when thermal
> > sensors report temperature thresholds being crossed.
> >
> > Instead of using the cooling device mechanism, this statically programs
> > the throttling mechanism when it is configured in device tree. In order
> > to do this, an additional device tree property is needed to replace the
> > information that was previously contained in trip points.
> >
> > There's a few preparatory patches to make the removal a bit simpler and
> > also some follow up cleanups included as well.
> >
> > Changes in v2:
> > - rework the device tree bindings:
> >   - add nvidia,thermal-zones property to attach throttling to zones
> >   - use -millicelsius suffix and add hysteresis
> > - add patch to store thermal zone device tree node for later use
> > - add patch to enforce self-encapsulation of the thermal core now that
> >   no drivers need to reach into it anymore
> >
> > This applies on top of Daniel's self-encapsulation hardening series:
> >
> >         https://lore.kernel.org/all/20231012102700.2858952-1-daniel.lezcano@linaro.org/
> >
> > Thierry
> >
> > Thierry Reding (13):
> >   thermal: Store device tree node for thermal zone devices
> >   dt-bindings: thermal: tegra: Document throttle temperature
> >   dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
> >   thermal: tegra: Use driver-private data consistently
> >   thermal: tegra: Constify SoC-specific data
> >   thermal: tegra: Do not register cooling device
> >   thermal: tegra: Use unsigned int where appropriate
> >   thermal: tegra: Avoid over-allocation of temporary array
> >   thermal: tegra: Remove gratuitous error assignment
> >   thermal: tegra: Minor stylistic cleanups
> >   ARM: tegra: Rework SOCTHERM on Tegra124
> >   arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
> >   thermal: Enforce self-encapsulation
> >
> >  .../thermal/nvidia,tegra124-soctherm.yaml     |  19 +
> >  arch/arm/boot/dts/nvidia/tegra124.dtsi        |  68 +--
> >  arch/arm64/boot/dts/nvidia/tegra132.dtsi      |  66 +--
> >  arch/arm64/boot/dts/nvidia/tegra210.dtsi      |  86 +--
> >  drivers/thermal/tegra/soctherm.c              | 525 ++++++++----------
> >  drivers/thermal/tegra/soctherm.h              |   1 +
> >  drivers/thermal/tegra/tegra124-soctherm.c     |   4 +
> >  drivers/thermal/tegra/tegra132-soctherm.c     |   4 +
> >  drivers/thermal/tegra/tegra210-soctherm.c     |   4 +
> >  drivers/thermal/thermal_core.h                |   2 +-
> >  drivers/thermal/thermal_of.c                  |   3 +
> >  11 files changed, 329 insertions(+), 453 deletions(-)
> >
> > --
> > 2.42.0
> >
> 
> I'm still experiencing the following message on jetson-tx1 with this
> serie applied on top of 6.6-rc5 (with iommu-next and tegra-next
> applied).
> oct. 13 10:53:16 jetson-tx1 kernel: max77620-thermal max77620-thermal:
> Failed to register thermal zone: -19

Not sure about this one. I don't think I've seen it. Do you know if
that's somehow caused by this series, or is it preexisting?

> oct. 13 10:53:16 jetson-tx1 kernel: tegra_soctherm
> 700e2000.thermal-sensor: throttle-cfg: heavy: no throt prop or invalid
> prop
> 
> Is this expected ?

This one is definitely not expected. I have seen this before, and it
happens when the device tree doesn't contain all the properties that are
expected. Patch 12 in this series should take care of that. Have you
made sure that that's been applied and is what the kernel uses to boot
with?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] thermal: tegra: Do not register cooling device
  2023-10-13 11:43   ` Thierry Reding
@ 2023-10-13 12:45     ` Nicolas Chauvet
  2023-10-13 13:13       ` Thierry Reding
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Chauvet @ 2023-10-13 12:45 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Daniel Lezcano, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Amit Kucheria, Zhang Rui,
	Jon Hunter, linux-pm, linux-tegra, devicetree

Le ven. 13 oct. 2023 à 13:43, Thierry Reding
<thierry.reding@gmail.com> a écrit :
>
> On Fri, Oct 13, 2023 at 11:14:25AM +0200, Nicolas Chauvet wrote:
> > Le jeu. 12 oct. 2023 à 19:58, Thierry Reding
> > <thierry.reding@gmail.com> a écrit :
> > >
> > > From: Thierry Reding <treding@nvidia.com>
> > >
> > > Hi,
> > >
> > > this set of patches removes the registration of the SOCTHERM internal
> > > throttling mechanism as cooling device. Since this throttling starts
> > > automatically once a certain temperature threshold is crossed, it
> > > doesn't make sense to represent it as a cooling device, which are
> > > typically "manually" activated by the thermal framework when thermal
> > > sensors report temperature thresholds being crossed.
> > >
> > > Instead of using the cooling device mechanism, this statically programs
> > > the throttling mechanism when it is configured in device tree. In order
> > > to do this, an additional device tree property is needed to replace the
> > > information that was previously contained in trip points.
> > >
> > > There's a few preparatory patches to make the removal a bit simpler and
> > > also some follow up cleanups included as well.
> > >
> > > Changes in v2:
> > > - rework the device tree bindings:
> > >   - add nvidia,thermal-zones property to attach throttling to zones
> > >   - use -millicelsius suffix and add hysteresis
> > > - add patch to store thermal zone device tree node for later use
> > > - add patch to enforce self-encapsulation of the thermal core now that
> > >   no drivers need to reach into it anymore
> > >
> > > This applies on top of Daniel's self-encapsulation hardening series:
> > >
> > >         https://lore.kernel.org/all/20231012102700.2858952-1-daniel.lezcano@linaro.org/
> > >
> > > Thierry
> > >
> > > Thierry Reding (13):
> > >   thermal: Store device tree node for thermal zone devices
> > >   dt-bindings: thermal: tegra: Document throttle temperature
> > >   dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
> > >   thermal: tegra: Use driver-private data consistently
> > >   thermal: tegra: Constify SoC-specific data
> > >   thermal: tegra: Do not register cooling device
> > >   thermal: tegra: Use unsigned int where appropriate
> > >   thermal: tegra: Avoid over-allocation of temporary array
> > >   thermal: tegra: Remove gratuitous error assignment
> > >   thermal: tegra: Minor stylistic cleanups
> > >   ARM: tegra: Rework SOCTHERM on Tegra124
> > >   arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
> > >   thermal: Enforce self-encapsulation
> > >
> > >  .../thermal/nvidia,tegra124-soctherm.yaml     |  19 +
> > >  arch/arm/boot/dts/nvidia/tegra124.dtsi        |  68 +--
> > >  arch/arm64/boot/dts/nvidia/tegra132.dtsi      |  66 +--
> > >  arch/arm64/boot/dts/nvidia/tegra210.dtsi      |  86 +--
> > >  drivers/thermal/tegra/soctherm.c              | 525 ++++++++----------
> > >  drivers/thermal/tegra/soctherm.h              |   1 +
> > >  drivers/thermal/tegra/tegra124-soctherm.c     |   4 +
> > >  drivers/thermal/tegra/tegra132-soctherm.c     |   4 +
> > >  drivers/thermal/tegra/tegra210-soctherm.c     |   4 +
> > >  drivers/thermal/thermal_core.h                |   2 +-
> > >  drivers/thermal/thermal_of.c                  |   3 +
> > >  11 files changed, 329 insertions(+), 453 deletions(-)
> > >
> > > --
> > > 2.42.0
> > >
> >
> > I'm still experiencing the following message on jetson-tx1 with this
> > serie applied on top of 6.6-rc5 (with iommu-next and tegra-next
> > applied).
> > oct. 13 10:53:16 jetson-tx1 kernel: max77620-thermal max77620-thermal:
> > Failed to register thermal zone: -19
>
> Not sure about this one. I don't think I've seen it. Do you know if
> that's somehow caused by this series, or is it preexisting?

It's pre-existing from this serie.

> > oct. 13 10:53:16 jetson-tx1 kernel: tegra_soctherm
> > 700e2000.thermal-sensor: throttle-cfg: heavy: no throt prop or invalid
> > prop
> >
> > Is this expected ?
>
> This one is definitely not expected. I have seen this before, and it
> happens when the device tree doesn't contain all the properties that are
> expected. Patch 12 in this series should take care of that. Have you
> made sure that that's been applied and is what the kernel uses to boot
> with?

Yes, this dtb change in patch12 is propagated to the device (as seen
in /boot/dtbs)
But comparing with what's available at runtime in /proc/device-tree, I
see some changes

                        heavy {
-                               hysteresis-millicelsius = <0xfa0>;
+                               #cooling-cells = <0x02>;
                                nvidia,cpu-throt-percent = <0x55>;
                                nvidia,gpu-throt-level = <0x03>;
                                nvidia,priority = <0x64>;
-                               nvidia,thermal-zones = <0x49 0x4a>;
-                               temperature-millicelsius = <0x180c4>;
+                               phandle = <0x130>;
                        };

I'm using u-boot 2023.07 with EFI boot (L4T 32.7.4).
Could it be that the bootloader has changed these entries ? Can this
be prevented ?
(MAC ethernet address is set as appropropriate).

Thanks

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] thermal: tegra: Do not register cooling device
  2023-10-13 12:45     ` Nicolas Chauvet
@ 2023-10-13 13:13       ` Thierry Reding
  2023-10-13 13:55         ` Nicolas Chauvet
  0 siblings, 1 reply; 30+ messages in thread
From: Thierry Reding @ 2023-10-13 13:13 UTC (permalink / raw)
  To: Nicolas Chauvet
  Cc: Daniel Lezcano, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Amit Kucheria, Zhang Rui,
	Jon Hunter, linux-pm, linux-tegra, devicetree

[-- Attachment #1: Type: text/plain, Size: 6054 bytes --]

On Fri, Oct 13, 2023 at 02:45:57PM +0200, Nicolas Chauvet wrote:
> Le ven. 13 oct. 2023 à 13:43, Thierry Reding
> <thierry.reding@gmail.com> a écrit :
> >
> > On Fri, Oct 13, 2023 at 11:14:25AM +0200, Nicolas Chauvet wrote:
> > > Le jeu. 12 oct. 2023 à 19:58, Thierry Reding
> > > <thierry.reding@gmail.com> a écrit :
> > > >
> > > > From: Thierry Reding <treding@nvidia.com>
> > > >
> > > > Hi,
> > > >
> > > > this set of patches removes the registration of the SOCTHERM internal
> > > > throttling mechanism as cooling device. Since this throttling starts
> > > > automatically once a certain temperature threshold is crossed, it
> > > > doesn't make sense to represent it as a cooling device, which are
> > > > typically "manually" activated by the thermal framework when thermal
> > > > sensors report temperature thresholds being crossed.
> > > >
> > > > Instead of using the cooling device mechanism, this statically programs
> > > > the throttling mechanism when it is configured in device tree. In order
> > > > to do this, an additional device tree property is needed to replace the
> > > > information that was previously contained in trip points.
> > > >
> > > > There's a few preparatory patches to make the removal a bit simpler and
> > > > also some follow up cleanups included as well.
> > > >
> > > > Changes in v2:
> > > > - rework the device tree bindings:
> > > >   - add nvidia,thermal-zones property to attach throttling to zones
> > > >   - use -millicelsius suffix and add hysteresis
> > > > - add patch to store thermal zone device tree node for later use
> > > > - add patch to enforce self-encapsulation of the thermal core now that
> > > >   no drivers need to reach into it anymore
> > > >
> > > > This applies on top of Daniel's self-encapsulation hardening series:
> > > >
> > > >         https://lore.kernel.org/all/20231012102700.2858952-1-daniel.lezcano@linaro.org/
> > > >
> > > > Thierry
> > > >
> > > > Thierry Reding (13):
> > > >   thermal: Store device tree node for thermal zone devices
> > > >   dt-bindings: thermal: tegra: Document throttle temperature
> > > >   dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
> > > >   thermal: tegra: Use driver-private data consistently
> > > >   thermal: tegra: Constify SoC-specific data
> > > >   thermal: tegra: Do not register cooling device
> > > >   thermal: tegra: Use unsigned int where appropriate
> > > >   thermal: tegra: Avoid over-allocation of temporary array
> > > >   thermal: tegra: Remove gratuitous error assignment
> > > >   thermal: tegra: Minor stylistic cleanups
> > > >   ARM: tegra: Rework SOCTHERM on Tegra124
> > > >   arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
> > > >   thermal: Enforce self-encapsulation
> > > >
> > > >  .../thermal/nvidia,tegra124-soctherm.yaml     |  19 +
> > > >  arch/arm/boot/dts/nvidia/tegra124.dtsi        |  68 +--
> > > >  arch/arm64/boot/dts/nvidia/tegra132.dtsi      |  66 +--
> > > >  arch/arm64/boot/dts/nvidia/tegra210.dtsi      |  86 +--
> > > >  drivers/thermal/tegra/soctherm.c              | 525 ++++++++----------
> > > >  drivers/thermal/tegra/soctherm.h              |   1 +
> > > >  drivers/thermal/tegra/tegra124-soctherm.c     |   4 +
> > > >  drivers/thermal/tegra/tegra132-soctherm.c     |   4 +
> > > >  drivers/thermal/tegra/tegra210-soctherm.c     |   4 +
> > > >  drivers/thermal/thermal_core.h                |   2 +-
> > > >  drivers/thermal/thermal_of.c                  |   3 +
> > > >  11 files changed, 329 insertions(+), 453 deletions(-)
> > > >
> > > > --
> > > > 2.42.0
> > > >
> > >
> > > I'm still experiencing the following message on jetson-tx1 with this
> > > serie applied on top of 6.6-rc5 (with iommu-next and tegra-next
> > > applied).
> > > oct. 13 10:53:16 jetson-tx1 kernel: max77620-thermal max77620-thermal:
> > > Failed to register thermal zone: -19
> >
> > Not sure about this one. I don't think I've seen it. Do you know if
> > that's somehow caused by this series, or is it preexisting?
> 
> It's pre-existing from this serie.
> 
> > > oct. 13 10:53:16 jetson-tx1 kernel: tegra_soctherm
> > > 700e2000.thermal-sensor: throttle-cfg: heavy: no throt prop or invalid
> > > prop
> > >
> > > Is this expected ?
> >
> > This one is definitely not expected. I have seen this before, and it
> > happens when the device tree doesn't contain all the properties that are
> > expected. Patch 12 in this series should take care of that. Have you
> > made sure that that's been applied and is what the kernel uses to boot
> > with?
> 
> Yes, this dtb change in patch12 is propagated to the device (as seen
> in /boot/dtbs)
> But comparing with what's available at runtime in /proc/device-tree, I
> see some changes
> 
>                         heavy {
> -                               hysteresis-millicelsius = <0xfa0>;
> +                               #cooling-cells = <0x02>;
>                                 nvidia,cpu-throt-percent = <0x55>;
>                                 nvidia,gpu-throt-level = <0x03>;
>                                 nvidia,priority = <0x64>;
> -                               nvidia,thermal-zones = <0x49 0x4a>;
> -                               temperature-millicelsius = <0x180c4>;
> +                               phandle = <0x130>;
>                         };

Okay, that explains the error message.

> 
> I'm using u-boot 2023.07 with EFI boot (L4T 32.7.4).
> Could it be that the bootloader has changed these entries ? Can this
> be prevented ?

I'm not aware of anything in the bootloader that would do this. Some
versions of U-Boot that ships with L4T can copy certain nodes in DTB but
I have never seen anything that would've touched thermal.

Is it possible that you're not loading the DTB and end up receiving one
from UEFI or cboot?

> (MAC ethernet address is set as appropropriate).

That's a completely separate mechanism and shouldn't touch thermal at
all.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] thermal: tegra: Do not register cooling device
  2023-10-13 13:13       ` Thierry Reding
@ 2023-10-13 13:55         ` Nicolas Chauvet
  2023-10-13 15:45           ` Thierry Reding
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Chauvet @ 2023-10-13 13:55 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Daniel Lezcano, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Amit Kucheria, Zhang Rui,
	Jon Hunter, linux-pm, linux-tegra, devicetree

Le ven. 13 oct. 2023 à 15:13, Thierry Reding
<thierry.reding@gmail.com> a écrit :
>
> On Fri, Oct 13, 2023 at 02:45:57PM +0200, Nicolas Chauvet wrote:
> > Le ven. 13 oct. 2023 à 13:43, Thierry Reding
> > <thierry.reding@gmail.com> a écrit :
> > >
> > > On Fri, Oct 13, 2023 at 11:14:25AM +0200, Nicolas Chauvet wrote:
> > > > Le jeu. 12 oct. 2023 à 19:58, Thierry Reding
> > > > <thierry.reding@gmail.com> a écrit :
> > > > >
> > > > > From: Thierry Reding <treding@nvidia.com>
> > > > >
> > > > > Hi,
> > > > >
> > > > > this set of patches removes the registration of the SOCTHERM internal
> > > > > throttling mechanism as cooling device. Since this throttling starts
> > > > > automatically once a certain temperature threshold is crossed, it
> > > > > doesn't make sense to represent it as a cooling device, which are
> > > > > typically "manually" activated by the thermal framework when thermal
> > > > > sensors report temperature thresholds being crossed.
> > > > >
> > > > > Instead of using the cooling device mechanism, this statically programs
> > > > > the throttling mechanism when it is configured in device tree. In order
> > > > > to do this, an additional device tree property is needed to replace the
> > > > > information that was previously contained in trip points.
> > > > >
> > > > > There's a few preparatory patches to make the removal a bit simpler and
> > > > > also some follow up cleanups included as well.
> > > > >
> > > > > Changes in v2:
> > > > > - rework the device tree bindings:
> > > > >   - add nvidia,thermal-zones property to attach throttling to zones
> > > > >   - use -millicelsius suffix and add hysteresis
> > > > > - add patch to store thermal zone device tree node for later use
> > > > > - add patch to enforce self-encapsulation of the thermal core now that
> > > > >   no drivers need to reach into it anymore
> > > > >
> > > > > This applies on top of Daniel's self-encapsulation hardening series:
> > > > >
> > > > >         https://lore.kernel.org/all/20231012102700.2858952-1-daniel.lezcano@linaro.org/
> > > > >
> > > > > Thierry
> > > > >
> > > > > Thierry Reding (13):
> > > > >   thermal: Store device tree node for thermal zone devices
> > > > >   dt-bindings: thermal: tegra: Document throttle temperature
> > > > >   dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
> > > > >   thermal: tegra: Use driver-private data consistently
> > > > >   thermal: tegra: Constify SoC-specific data
> > > > >   thermal: tegra: Do not register cooling device
> > > > >   thermal: tegra: Use unsigned int where appropriate
> > > > >   thermal: tegra: Avoid over-allocation of temporary array
> > > > >   thermal: tegra: Remove gratuitous error assignment
> > > > >   thermal: tegra: Minor stylistic cleanups
> > > > >   ARM: tegra: Rework SOCTHERM on Tegra124
> > > > >   arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
> > > > >   thermal: Enforce self-encapsulation
> > > > >
> > > > >  .../thermal/nvidia,tegra124-soctherm.yaml     |  19 +
> > > > >  arch/arm/boot/dts/nvidia/tegra124.dtsi        |  68 +--
> > > > >  arch/arm64/boot/dts/nvidia/tegra132.dtsi      |  66 +--
> > > > >  arch/arm64/boot/dts/nvidia/tegra210.dtsi      |  86 +--
> > > > >  drivers/thermal/tegra/soctherm.c              | 525 ++++++++----------
> > > > >  drivers/thermal/tegra/soctherm.h              |   1 +
> > > > >  drivers/thermal/tegra/tegra124-soctherm.c     |   4 +
> > > > >  drivers/thermal/tegra/tegra132-soctherm.c     |   4 +
> > > > >  drivers/thermal/tegra/tegra210-soctherm.c     |   4 +
> > > > >  drivers/thermal/thermal_core.h                |   2 +-
> > > > >  drivers/thermal/thermal_of.c                  |   3 +
> > > > >  11 files changed, 329 insertions(+), 453 deletions(-)
> > > > >
> > > > > --
> > > > > 2.42.0
> > > > >
> > > >
> > > > I'm still experiencing the following message on jetson-tx1 with this
> > > > serie applied on top of 6.6-rc5 (with iommu-next and tegra-next
> > > > applied).
> > > > oct. 13 10:53:16 jetson-tx1 kernel: max77620-thermal max77620-thermal:
> > > > Failed to register thermal zone: -19
> > >
> > > Not sure about this one. I don't think I've seen it. Do you know if
> > > that's somehow caused by this series, or is it preexisting?
> >
> > It's pre-existing from this serie.
> >
> > > > oct. 13 10:53:16 jetson-tx1 kernel: tegra_soctherm
> > > > 700e2000.thermal-sensor: throttle-cfg: heavy: no throt prop or invalid
> > > > prop
> > > >
> > > > Is this expected ?
> > >
> > > This one is definitely not expected. I have seen this before, and it
> > > happens when the device tree doesn't contain all the properties that are
> > > expected. Patch 12 in this series should take care of that. Have you
> > > made sure that that's been applied and is what the kernel uses to boot
> > > with?
> >
> > Yes, this dtb change in patch12 is propagated to the device (as seen
> > in /boot/dtbs)
> > But comparing with what's available at runtime in /proc/device-tree, I
> > see some changes
> >
> >                         heavy {
> > -                               hysteresis-millicelsius = <0xfa0>;
> > +                               #cooling-cells = <0x02>;
> >                                 nvidia,cpu-throt-percent = <0x55>;
> >                                 nvidia,gpu-throt-level = <0x03>;
> >                                 nvidia,priority = <0x64>;
> > -                               nvidia,thermal-zones = <0x49 0x4a>;
> > -                               temperature-millicelsius = <0x180c4>;
> > +                               phandle = <0x130>;
> >                         };
>
> Okay, that explains the error message.
>
> >
> > I'm using u-boot 2023.07 with EFI boot (L4T 32.7.4).
> > Could it be that the bootloader has changed these entries ? Can this
> > be prevented ?
>
> I'm not aware of anything in the bootloader that would do this. Some
> versions of U-Boot that ships with L4T can copy certain nodes in DTB but
> I have never seen anything that would've touched thermal.
>
> Is it possible that you're not loading the DTB and end up receiving one
> from UEFI or cboot?
Seems like it: a previous copy was in /boot/dts that took over /boot/dtbs.
With updated dtb, the warning disappeared.

Only remaining error is: kernel: max77620-thermal max77620-thermal:
Failed to register thermal zone: -19

Thanks

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] thermal: tegra: Do not register cooling device
  2023-10-13 13:55         ` Nicolas Chauvet
@ 2023-10-13 15:45           ` Thierry Reding
  0 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-10-13 15:45 UTC (permalink / raw)
  To: Nicolas Chauvet
  Cc: Daniel Lezcano, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Amit Kucheria, Zhang Rui,
	Jon Hunter, linux-pm, linux-tegra, devicetree

[-- Attachment #1: Type: text/plain, Size: 7097 bytes --]

On Fri, Oct 13, 2023 at 03:55:43PM +0200, Nicolas Chauvet wrote:
> Le ven. 13 oct. 2023 à 15:13, Thierry Reding
> <thierry.reding@gmail.com> a écrit :
> >
> > On Fri, Oct 13, 2023 at 02:45:57PM +0200, Nicolas Chauvet wrote:
> > > Le ven. 13 oct. 2023 à 13:43, Thierry Reding
> > > <thierry.reding@gmail.com> a écrit :
> > > >
> > > > On Fri, Oct 13, 2023 at 11:14:25AM +0200, Nicolas Chauvet wrote:
> > > > > Le jeu. 12 oct. 2023 à 19:58, Thierry Reding
> > > > > <thierry.reding@gmail.com> a écrit :
> > > > > >
> > > > > > From: Thierry Reding <treding@nvidia.com>
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > this set of patches removes the registration of the SOCTHERM internal
> > > > > > throttling mechanism as cooling device. Since this throttling starts
> > > > > > automatically once a certain temperature threshold is crossed, it
> > > > > > doesn't make sense to represent it as a cooling device, which are
> > > > > > typically "manually" activated by the thermal framework when thermal
> > > > > > sensors report temperature thresholds being crossed.
> > > > > >
> > > > > > Instead of using the cooling device mechanism, this statically programs
> > > > > > the throttling mechanism when it is configured in device tree. In order
> > > > > > to do this, an additional device tree property is needed to replace the
> > > > > > information that was previously contained in trip points.
> > > > > >
> > > > > > There's a few preparatory patches to make the removal a bit simpler and
> > > > > > also some follow up cleanups included as well.
> > > > > >
> > > > > > Changes in v2:
> > > > > > - rework the device tree bindings:
> > > > > >   - add nvidia,thermal-zones property to attach throttling to zones
> > > > > >   - use -millicelsius suffix and add hysteresis
> > > > > > - add patch to store thermal zone device tree node for later use
> > > > > > - add patch to enforce self-encapsulation of the thermal core now that
> > > > > >   no drivers need to reach into it anymore
> > > > > >
> > > > > > This applies on top of Daniel's self-encapsulation hardening series:
> > > > > >
> > > > > >         https://lore.kernel.org/all/20231012102700.2858952-1-daniel.lezcano@linaro.org/
> > > > > >
> > > > > > Thierry
> > > > > >
> > > > > > Thierry Reding (13):
> > > > > >   thermal: Store device tree node for thermal zone devices
> > > > > >   dt-bindings: thermal: tegra: Document throttle temperature
> > > > > >   dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
> > > > > >   thermal: tegra: Use driver-private data consistently
> > > > > >   thermal: tegra: Constify SoC-specific data
> > > > > >   thermal: tegra: Do not register cooling device
> > > > > >   thermal: tegra: Use unsigned int where appropriate
> > > > > >   thermal: tegra: Avoid over-allocation of temporary array
> > > > > >   thermal: tegra: Remove gratuitous error assignment
> > > > > >   thermal: tegra: Minor stylistic cleanups
> > > > > >   ARM: tegra: Rework SOCTHERM on Tegra124
> > > > > >   arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210
> > > > > >   thermal: Enforce self-encapsulation
> > > > > >
> > > > > >  .../thermal/nvidia,tegra124-soctherm.yaml     |  19 +
> > > > > >  arch/arm/boot/dts/nvidia/tegra124.dtsi        |  68 +--
> > > > > >  arch/arm64/boot/dts/nvidia/tegra132.dtsi      |  66 +--
> > > > > >  arch/arm64/boot/dts/nvidia/tegra210.dtsi      |  86 +--
> > > > > >  drivers/thermal/tegra/soctherm.c              | 525 ++++++++----------
> > > > > >  drivers/thermal/tegra/soctherm.h              |   1 +
> > > > > >  drivers/thermal/tegra/tegra124-soctherm.c     |   4 +
> > > > > >  drivers/thermal/tegra/tegra132-soctherm.c     |   4 +
> > > > > >  drivers/thermal/tegra/tegra210-soctherm.c     |   4 +
> > > > > >  drivers/thermal/thermal_core.h                |   2 +-
> > > > > >  drivers/thermal/thermal_of.c                  |   3 +
> > > > > >  11 files changed, 329 insertions(+), 453 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.42.0
> > > > > >
> > > > >
> > > > > I'm still experiencing the following message on jetson-tx1 with this
> > > > > serie applied on top of 6.6-rc5 (with iommu-next and tegra-next
> > > > > applied).
> > > > > oct. 13 10:53:16 jetson-tx1 kernel: max77620-thermal max77620-thermal:
> > > > > Failed to register thermal zone: -19
> > > >
> > > > Not sure about this one. I don't think I've seen it. Do you know if
> > > > that's somehow caused by this series, or is it preexisting?
> > >
> > > It's pre-existing from this serie.
> > >
> > > > > oct. 13 10:53:16 jetson-tx1 kernel: tegra_soctherm
> > > > > 700e2000.thermal-sensor: throttle-cfg: heavy: no throt prop or invalid
> > > > > prop
> > > > >
> > > > > Is this expected ?
> > > >
> > > > This one is definitely not expected. I have seen this before, and it
> > > > happens when the device tree doesn't contain all the properties that are
> > > > expected. Patch 12 in this series should take care of that. Have you
> > > > made sure that that's been applied and is what the kernel uses to boot
> > > > with?
> > >
> > > Yes, this dtb change in patch12 is propagated to the device (as seen
> > > in /boot/dtbs)
> > > But comparing with what's available at runtime in /proc/device-tree, I
> > > see some changes
> > >
> > >                         heavy {
> > > -                               hysteresis-millicelsius = <0xfa0>;
> > > +                               #cooling-cells = <0x02>;
> > >                                 nvidia,cpu-throt-percent = <0x55>;
> > >                                 nvidia,gpu-throt-level = <0x03>;
> > >                                 nvidia,priority = <0x64>;
> > > -                               nvidia,thermal-zones = <0x49 0x4a>;
> > > -                               temperature-millicelsius = <0x180c4>;
> > > +                               phandle = <0x130>;
> > >                         };
> >
> > Okay, that explains the error message.
> >
> > >
> > > I'm using u-boot 2023.07 with EFI boot (L4T 32.7.4).
> > > Could it be that the bootloader has changed these entries ? Can this
> > > be prevented ?
> >
> > I'm not aware of anything in the bootloader that would do this. Some
> > versions of U-Boot that ships with L4T can copy certain nodes in DTB but
> > I have never seen anything that would've touched thermal.
> >
> > Is it possible that you're not loading the DTB and end up receiving one
> > from UEFI or cboot?
> Seems like it: a previous copy was in /boot/dts that took over /boot/dtbs.
> With updated dtb, the warning disappeared.

Okay, great!

> Only remaining error is: kernel: max77620-thermal max77620-thermal:
> Failed to register thermal zone: -19

Looks like that's -ENODEV from devm_thermal_of_zone_register() via
max77620_thermal_probe(). Looking at thermal_of_zone_register(), the
-ENODEV case is treated specially, so perhaps we should be doing the
same thing in max77620_probe()?

Let me send out a patch.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 06/13] thermal: tegra: Do not register cooling device
  2023-10-12 17:58 ` [PATCH v2 06/13] thermal: tegra: Do not register cooling device Thierry Reding
@ 2023-10-13 15:57   ` Daniel Lezcano
  2023-11-10 13:55     ` Thierry Reding
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Lezcano @ 2023-10-13 15:57 UTC (permalink / raw)
  To: Thierry Reding, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

On 12/10/2023 19:58, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The SOCTHERM's built-in throttling mechanism doesn't map well to the
> concept of a cooling device because it will automatically start to
> throttle when the programmed temperature threshold is crossed.
> 
> Remove the cooling device implementation and instead unconditionally
> program the throttling for the CPU and GPU thermal zones.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---

[ ... ]

> +	ret = of_property_read_u32(np, "temperature-millicelsius",
> +				   &stc->temperature);
> +	if (ret < 0)
> +		goto err;
> +
> +	ret = of_property_read_u32(np, "hysteresis-millicelsius",
> +				   &stc->hysteresis);
> +	if (ret < 0)
> +		goto err;
> +
> +	stc->num_zones = of_count_phandle_with_args(np, "nvidia,thermal-zones",
> +						    NULL);
> +	if (stc->num_zones > 0) {
> +		struct device_node *zone;
> +		unsigned int i;
> +
> +		stc->zones = devm_kcalloc(ts->dev, stc->num_zones, sizeof(zone),
> +					  GFP_KERNEL);
> +		if (!stc->zones)
> +			return -ENOMEM;
> +
> +		for (i = 0; i < stc->num_zones; i++) {
> +			zone = of_parse_phandle(np, "nvidia,thermal-zones", i);
> +			stc->zones[i] = zone;
> +		}
> +	}

What is the connection between the temperature sensor and the hardware 
limiter?

I mean, one hand there is the hardware limiter which is not connected to 
the sensor neither a thermal zone and it could be self contained in a 
separate driver. And then there is the temperature sensor.

The thermal zone phandle things connected with the throttling bindings 
sounds like strange to me.

What prevents to split the throttling and the sensor into separate code?

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/13] dt-bindings: thermal: tegra: Add nvidia,thermal-zones property
  2023-10-12 17:58 ` [PATCH v2 03/13] dt-bindings: thermal: tegra: Add nvidia,thermal-zones property Thierry Reding
@ 2023-10-13 15:59   ` Daniel Lezcano
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Lezcano @ 2023-10-13 15:59 UTC (permalink / raw)
  To: Thierry Reding, Rafael J . Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm, devicetree,
	linux-tegra

On 12/10/2023 19:58, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> The throttle configurations need to be associated with one or more
> thermal zones before they can be enabled, so introduce a new property,
> called nvidia,thermal-zones, that contains a list of phandles to the
> thermal zones for which a throttle configuration is applicable.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> Changes in v2:
> - new patch to hook up throttling with thermal zones
> 
>   .../bindings/thermal/nvidia,tegra124-soctherm.yaml           | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> index 0eb6277082fe..359344f60a6e 100644
> --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> @@ -161,6 +161,11 @@ properties:
>                 throttling is engaged after the OC event is deasserted.
>               default: 0
>   
> +          nvidia,thermal-zones:
> +            $ref: /schemas/types.yaml#/definitions/phandle
> +            description: A list of phandles to the thermal zones that this
> +              throttle configuration applies to.
> +

 From a DT perspective, I believe it is more correct to point to the 
devices the hardware throttling mechanism applies to instead of the 
thermal zones which is kind of software component

>     # optional
>     nvidia,thermtrips:
>       $ref: /schemas/types.yaml#/definitions/uint32-matrix

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 02/13] dt-bindings: thermal: tegra: Document throttle temperature
  2023-10-12 17:58 ` [PATCH v2 02/13] dt-bindings: thermal: tegra: Document throttle temperature Thierry Reding
@ 2023-10-16 14:02   ` Rob Herring
  2023-11-10 13:58     ` Thierry Reding
  0 siblings, 1 reply; 30+ messages in thread
From: Rob Herring @ 2023-10-16 14:02 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Daniel Lezcano, Rafael J . Wysocki, Krzysztof Kozlowski,
	Conor Dooley, Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm,
	devicetree, linux-tegra

On Thu, Oct 12, 2023 at 07:58:23PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Each throttling configuration needs to specify the temperature threshold
> at which it should start throttling. Previously this was tied to a given
> trip point as a cooling device and used the temperature specified for
> that trip point. This doesn't work well because the throttling mechanism
> is not a cooling device in the traditional sense.
> 
> Instead, allow device trees to specify the throttle temperature in the
> throttle configuration directly so that the throttle doesn't need to be
> exposed as a cooling device.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> Changes in v2:
> - rename temperature to temperature-millicelsius and drop $ref
> - add hysteresis-millicelsius property
> 
>  .../bindings/thermal/nvidia,tegra124-soctherm.yaml | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> index 04a2ba1aa946..0eb6277082fe 100644
> --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> @@ -121,6 +121,20 @@ properties:
>                # high (85%, TEGRA_SOCTHERM_THROT_LEVEL_HIGH)
>                - 3
>  
> +          temperature-millicelsius:

'temperature' is redundant since we have units. Perhaps 
'throttle-millicelsius' or 'auto-throttle-millicelsius' instead to say 
what the temperature is for.

> +            minimum: -273000
> +            maximum: 200000

Quite impressive operating range.

> +            description: The temperature threshold (in millicelsius) that,
> +              when crossed, will trigger the configured automatic throttling.
> +
> +          hysteresis-millicelsius:
> +            description: An unsigned integer expressing the hysteresis delta
> +              (in millicelsius) with respect to the threshold temperature
> +              property above. Throttling will be initiated when the
> +              temperature falls below (temperature - hysteresis). This avoids
> +              situations where throttling is repeatedly initiated and stopped
> +              because of minor temperature variations.
> +
>            # optional
>            # Tegra210 specific and valid only for OCx throttle events
>            nvidia,count-threshold:
> -- 
> 2.42.0
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 06/13] thermal: tegra: Do not register cooling device
  2023-10-13 15:57   ` Daniel Lezcano
@ 2023-11-10 13:55     ` Thierry Reding
  2023-11-10 14:52       ` Thierry Reding
  0 siblings, 1 reply; 30+ messages in thread
From: Thierry Reding @ 2023-11-10 13:55 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J . Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm,
	devicetree, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 2968 bytes --]

On Fri, Oct 13, 2023 at 05:57:13PM +0200, Daniel Lezcano wrote:
> On 12/10/2023 19:58, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > The SOCTHERM's built-in throttling mechanism doesn't map well to the
> > concept of a cooling device because it will automatically start to
> > throttle when the programmed temperature threshold is crossed.
> > 
> > Remove the cooling device implementation and instead unconditionally
> > program the throttling for the CPU and GPU thermal zones.
> > 
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> 
> [ ... ]
> 
> > +	ret = of_property_read_u32(np, "temperature-millicelsius",
> > +				   &stc->temperature);
> > +	if (ret < 0)
> > +		goto err;
> > +
> > +	ret = of_property_read_u32(np, "hysteresis-millicelsius",
> > +				   &stc->hysteresis);
> > +	if (ret < 0)
> > +		goto err;
> > +
> > +	stc->num_zones = of_count_phandle_with_args(np, "nvidia,thermal-zones",
> > +						    NULL);
> > +	if (stc->num_zones > 0) {
> > +		struct device_node *zone;
> > +		unsigned int i;
> > +
> > +		stc->zones = devm_kcalloc(ts->dev, stc->num_zones, sizeof(zone),
> > +					  GFP_KERNEL);
> > +		if (!stc->zones)
> > +			return -ENOMEM;
> > +
> > +		for (i = 0; i < stc->num_zones; i++) {
> > +			zone = of_parse_phandle(np, "nvidia,thermal-zones", i);
> > +			stc->zones[i] = zone;
> > +		}
> > +	}
> 
> What is the connection between the temperature sensor and the hardware
> limiter?
> 
> I mean, one hand there is the hardware limiter which is not connected to the
> sensor neither a thermal zone and it could be self contained in a separate
> driver. And then there is the temperature sensor.
> 
> The thermal zone phandle things connected with the throttling bindings
> sounds like strange to me.
> 
> What prevents to split the throttling and the sensor into separate code?

Both the temperature sensor and the hardware throttle mechanism are part
of the same IP block, so it would be quite difficult (and unnecessary)
to split them into separate drivers.

The hardware throttler uses the temperature sensor's data to initiate
throttling automatically when certain (programmable) temperature
thresholds are reached.

The reason why we need to reference the thermal zone is because the
registers needed to program the throttler are contained within the
sensor group (which are effectively mapped to thermal zones).

I suppose there are a number of other ways how this could be described.
The thermal zones could be extended with extra information about the
throttling, or we could use just the sensor group ID instead of a full
phandle to reference this.

I was sort of trying to keep things somewhat aligned with the concept of
thermal zones and not rewrite the entire thing, but perhaps I should go
back to the drawing board and think about whether there's an even better
way to describe this in DT.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 02/13] dt-bindings: thermal: tegra: Document throttle temperature
  2023-10-16 14:02   ` Rob Herring
@ 2023-11-10 13:58     ` Thierry Reding
  0 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-11-10 13:58 UTC (permalink / raw)
  To: Rob Herring
  Cc: Daniel Lezcano, Rafael J . Wysocki, Krzysztof Kozlowski,
	Conor Dooley, Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm,
	devicetree, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 2089 bytes --]

On Mon, Oct 16, 2023 at 09:02:49AM -0500, Rob Herring wrote:
> On Thu, Oct 12, 2023 at 07:58:23PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > Each throttling configuration needs to specify the temperature threshold
> > at which it should start throttling. Previously this was tied to a given
> > trip point as a cooling device and used the temperature specified for
> > that trip point. This doesn't work well because the throttling mechanism
> > is not a cooling device in the traditional sense.
> > 
> > Instead, allow device trees to specify the throttle temperature in the
> > throttle configuration directly so that the throttle doesn't need to be
> > exposed as a cooling device.
> > 
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> > Changes in v2:
> > - rename temperature to temperature-millicelsius and drop $ref
> > - add hysteresis-millicelsius property
> > 
> >  .../bindings/thermal/nvidia,tegra124-soctherm.yaml | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> > index 04a2ba1aa946..0eb6277082fe 100644
> > --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> > +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.yaml
> > @@ -121,6 +121,20 @@ properties:
> >                # high (85%, TEGRA_SOCTHERM_THROT_LEVEL_HIGH)
> >                - 3
> >  
> > +          temperature-millicelsius:
> 
> 'temperature' is redundant since we have units. Perhaps 
> 'throttle-millicelsius' or 'auto-throttle-millicelsius' instead to say 
> what the temperature is for.

Okay, will do.

> 
> > +            minimum: -273000
> > +            maximum: 200000
> 
> Quite impressive operating range.

Yeah, I don't actually know where these come from. Looking at the manual
these sensors are actually limited to -127°C - 127°C. I'll fix that up.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 06/13] thermal: tegra: Do not register cooling device
  2023-11-10 13:55     ` Thierry Reding
@ 2023-11-10 14:52       ` Thierry Reding
  0 siblings, 0 replies; 30+ messages in thread
From: Thierry Reding @ 2023-11-10 14:52 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J . Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Amit Kucheria, Zhang Rui, Jon Hunter, linux-pm,
	devicetree, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 5508 bytes --]

On Fri, Nov 10, 2023 at 02:55:15PM +0100, Thierry Reding wrote:
> On Fri, Oct 13, 2023 at 05:57:13PM +0200, Daniel Lezcano wrote:
> > On 12/10/2023 19:58, Thierry Reding wrote:
> > > From: Thierry Reding <treding@nvidia.com>
> > > 
> > > The SOCTHERM's built-in throttling mechanism doesn't map well to the
> > > concept of a cooling device because it will automatically start to
> > > throttle when the programmed temperature threshold is crossed.
> > > 
> > > Remove the cooling device implementation and instead unconditionally
> > > program the throttling for the CPU and GPU thermal zones.
> > > 
> > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > ---
> > 
> > [ ... ]
> > 
> > > +	ret = of_property_read_u32(np, "temperature-millicelsius",
> > > +				   &stc->temperature);
> > > +	if (ret < 0)
> > > +		goto err;
> > > +
> > > +	ret = of_property_read_u32(np, "hysteresis-millicelsius",
> > > +				   &stc->hysteresis);
> > > +	if (ret < 0)
> > > +		goto err;
> > > +
> > > +	stc->num_zones = of_count_phandle_with_args(np, "nvidia,thermal-zones",
> > > +						    NULL);
> > > +	if (stc->num_zones > 0) {
> > > +		struct device_node *zone;
> > > +		unsigned int i;
> > > +
> > > +		stc->zones = devm_kcalloc(ts->dev, stc->num_zones, sizeof(zone),
> > > +					  GFP_KERNEL);
> > > +		if (!stc->zones)
> > > +			return -ENOMEM;
> > > +
> > > +		for (i = 0; i < stc->num_zones; i++) {
> > > +			zone = of_parse_phandle(np, "nvidia,thermal-zones", i);
> > > +			stc->zones[i] = zone;
> > > +		}
> > > +	}
> > 
> > What is the connection between the temperature sensor and the hardware
> > limiter?
> > 
> > I mean, one hand there is the hardware limiter which is not connected to the
> > sensor neither a thermal zone and it could be self contained in a separate
> > driver. And then there is the temperature sensor.
> > 
> > The thermal zone phandle things connected with the throttling bindings
> > sounds like strange to me.
> > 
> > What prevents to split the throttling and the sensor into separate code?
> 
> Both the temperature sensor and the hardware throttle mechanism are part
> of the same IP block, so it would be quite difficult (and unnecessary)
> to split them into separate drivers.
> 
> The hardware throttler uses the temperature sensor's data to initiate
> throttling automatically when certain (programmable) temperature
> thresholds are reached.
> 
> The reason why we need to reference the thermal zone is because the
> registers needed to program the throttler are contained within the
> sensor group (which are effectively mapped to thermal zones).
> 
> I suppose there are a number of other ways how this could be described.
> The thermal zones could be extended with extra information about the
> throttling, or we could use just the sensor group ID instead of a full
> phandle to reference this.
> 
> I was sort of trying to keep things somewhat aligned with the concept of
> thermal zones and not rewrite the entire thing, but perhaps I should go
> back to the drawing board and think about whether there's an even better
> way to describe this in DT.

I've looked at the documentation in a bit more details and here's an
high-level overview of what SOCTHERM is.

We have four groups (CPU, GPU, MEM and PLLX), each of which can be
programmed at four different levels (each level is an identical set of
registers to program temperature thresholds, throttling and enable or
disable). For temperature thresholds an interrupt can be configured.

There's an additional "thermtrip" level, which only has a threshold
that, when reached, will cause an emergency, hardware-induced shutdown
of the system.

Any of the generic levels can be used in whatever way we want. The
convention currently is to program the thermal zone trip points using
level 0. So for each group we create a thermal zone and level 0 for each
of the zones is programmed with the low and high thresholds for a given
trip point.

Currently we also use levels 1 and 2 to program the "light" and "heavy"
throttling "indicators". These will in turn be used to generate outputs
to the actual throttling mechanisms (CPU-light, CPU-heavy, GPU-light and
GPU-heavy).

There are a few other things that can be done, but I don't fully
understand how they would be useful and I don't think they've ever been
used, so I'll skip those for now.

Given the above, the thermal zone trip points are fairly clear. They are
fine as they are implemented. For the throttling mechanism we could do
something that maps more explicitly to the above groups and levels
concepts, but I think that could easily conflict with the trip points
programming, so keeping with the current conventions seems good and
designing the device tree bindings accordingly would help avoid any
conflicts.

So I think keeping the throttle-cfgs node is a good fit. We don't really
need to establish a connection between the thermal zone and the throttle
mechanism, though. We can derive the level from the indicator (light or
heavy) and for the group we only need an ID. The reason why I proposed a
link to the thermal-zone is because that thermal zone contains that ID
already, but we could equally well just add an nvidia,group property or
something along those lines so we know which group to use rather than
try and get it from a thermal zone.

I'll revise the bindings to see if I can come up with something.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2023-11-10 14:52 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-12 17:58 [PATCH v2 00/13] thermal: tegra: Do not register cooling device Thierry Reding
2023-10-12 17:58 ` [PATCH v2 01/13] thermal: Store device tree node for thermal zone devices Thierry Reding
2023-10-12 17:58 ` [PATCH v2 02/13] dt-bindings: thermal: tegra: Document throttle temperature Thierry Reding
2023-10-16 14:02   ` Rob Herring
2023-11-10 13:58     ` Thierry Reding
2023-10-12 17:58 ` [PATCH v2 03/13] dt-bindings: thermal: tegra: Add nvidia,thermal-zones property Thierry Reding
2023-10-13 15:59   ` Daniel Lezcano
2023-10-12 17:58 ` [PATCH v2 04/13] thermal: tegra: Use driver-private data consistently Thierry Reding
2023-10-13  8:04   ` Daniel Lezcano
2023-10-13 11:40     ` Thierry Reding
2023-10-12 17:58 ` [PATCH v2 05/13] thermal: tegra: Constify SoC-specific data Thierry Reding
2023-10-12 17:58 ` [PATCH v2 06/13] thermal: tegra: Do not register cooling device Thierry Reding
2023-10-13 15:57   ` Daniel Lezcano
2023-11-10 13:55     ` Thierry Reding
2023-11-10 14:52       ` Thierry Reding
2023-10-12 17:58 ` [PATCH v2 07/13] thermal: tegra: Use unsigned int where appropriate Thierry Reding
2023-10-12 17:58 ` [PATCH v2 08/13] thermal: tegra: Avoid over-allocation of temporary array Thierry Reding
2023-10-12 17:58 ` [PATCH v2 09/13] thermal: tegra: Remove gratuitous error assignment Thierry Reding
2023-10-12 17:58 ` [PATCH v2 10/13] thermal: tegra: Minor stylistic cleanups Thierry Reding
2023-10-12 17:58 ` [PATCH v2 11/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210 Thierry Reding
2023-10-12 17:58 ` [PATCH v2 11/13] ARM: tegra: Rework SOCTHERM on Tegra124 Thierry Reding
2023-10-12 17:58 ` [PATCH v2 12/13] arm64: tegra: Rework SOCTHERM on Tegra132 and Tegra210 Thierry Reding
2023-10-12 17:58 ` [PATCH v2 12/13] ARM: tegra: Rework SOCTHERM on Tegra124 Thierry Reding
2023-10-12 17:58 ` [PATCH v2 13/13] thermal: Enforce self-encapsulation Thierry Reding
2023-10-13  9:14 ` [PATCH v2 00/13] thermal: tegra: Do not register cooling device Nicolas Chauvet
2023-10-13 11:43   ` Thierry Reding
2023-10-13 12:45     ` Nicolas Chauvet
2023-10-13 13:13       ` Thierry Reding
2023-10-13 13:55         ` Nicolas Chauvet
2023-10-13 15:45           ` Thierry Reding

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).