public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Linux ACPI <linux-acpi@vger.kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	Michal Wilczynski <michal.wilczynski@intel.com>,
	Zhang Rui <rui.zhang@intel.com>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Subject: Re: [PATCH v5 05/11] ACPI: thermal: Carry out trip point updates under zone lock
Date: Thu, 17 Aug 2023 15:09:34 +0200	[thread overview]
Message-ID: <3262036.aeNJFYEL58@kreacher> (raw)
In-Reply-To: <c53f99db-353a-26c3-3b0a-3a3befbed528@linaro.org>

On Wednesday, August 16, 2023 6:25:30 PM CEST Daniel Lezcano wrote:
> On 07/08/2023 20:08, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > There is a race condition between acpi_thermal_trips_update() and
> > acpi_thermal_check_fn(), because the trip points may get updated while
> > the latter is running which in theory may lead to inconsistent results.
> > For example, if two trips are updated together, using the temperature
> > value of one of them from before the update and the temperature value
> > of the other one from after the update may not lead to the expected
> > outcome.
> > 
> > Moreover, if thermal_get_trend() runs when a trip points update is in
> > progress, it may end up using stale trip point temperatures.
> > 
> > To address this, make acpi_thermal_trips_update() call
> > thermal_zone_device_adjust() to carry out the trip points update and
> > provide a new  acpi_thermal_adjust_thermal_zone() wrapper around
> > __acpi_thermal_trips_update() as the callback function for the latter.
> > 
> > While at it, change the acpi_thermal_trips_update() return data type
> > to void as that function always returns 0 anyway.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> 
> [ ... ]
> 
> >   {
> > -	int i, ret = acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT);
> >   	bool valid;
> > +	int i;
> >   
> > -	if (ret)
> > -		return ret;
> > +	__acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT);
> >   
> >   	valid = tz->trips.critical.valid |
> >   		tz->trips.hot.valid |
> > @@ -710,6 +732,7 @@ static struct thermal_zone_device_ops ac
> >   	.get_trend = thermal_get_trend,
> >   	.hot = acpi_thermal_zone_device_hot,
> >   	.critical = acpi_thermal_zone_device_critical,
> > +	.update = acpi_thermal_adjust_thermal_zone,
> 
> It is too bad we have to add a callback in the core code just for this 
> driver.
> 
> I'm wondering if it is not possible to get rid of it ?

Well, it is possible to pass the callback as an argument to the function running it.

The code is slightly simpler this way, so I think I'm going to do that.

Please see the appended replacement for patch [02/11].

Of course, it also is possible to provide accessors for acquiring and releasing
the zone lock, which would be more straightforward still (as mentioned before),
but I kind of understand the concerns regarding possible abuse of those by
drivers.

> Is it possible to use an internal lock for the ACPI driver to solve the 
> race issue above ?

No, it is not, and I have already explained it at least once, but let me do
that once again.

There are three code paths that need to be synchronized, because each of them
can run in parallel with any of the other two.

(a) acpi_thermal_trips_update() called via acpi_thermal_notify() which runs
    in the ACPI notify kworker context.
(b) thermal_get_trend(), already called under the zone lock by the core.
(c) acpi_thermal_check_fn() running in a kworker context, which calls
    thermal_zone_device_update() which it turn takes the zone lock.

Also the trip points update should not race with any computations using trip
point temperatures in the core or in the governors (they are carried out under
the zone lock as a rule).

(b) means that the local lock would need to be always taken under the zone
lock and then either acpi_thermal_check_fn() would need to be able to take
the local lock under the zone lock (so it would need to be able to acquire
the zone lock more or less directly), or acpi_thermal_trips_update() can
use the zone lock (which happens in the $subject patch via the new helper
function).

Moreover, using a local lock in acpi_thermal_trips_update() does not provide
any protection for the code using trip temperatures that runs under the zone
lock mentioned above.

So as I said, the patch below replaces [02/11] and it avoids adding a new
callback to zone operations.  The code gets slightly simpler with [02/11]
replaced with the appended one, so I'm going to use the latter.

It requires the $subject patch and patch [11/11] to be rebased, but that
is so trivial that I'm not even going to send updates of these patches.

The current series is available in the acpi-thermal git branch in
linux-pm.git.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] thermal: core: Introduce thermal_zone_device_exec()

Introduce a new helper function, thermal_zone_device_exec(), that can
be used by drivers to run a given callback routine under the zone lock.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_core.c |   19 +++++++++++++++++++
 include/linux/thermal.h        |    4 ++++
 2 files changed, 23 insertions(+)

Index: linux-pm/include/linux/thermal.h
===================================================================
--- linux-pm.orig/include/linux/thermal.h
+++ linux-pm/include/linux/thermal.h
@@ -323,6 +323,10 @@ int thermal_zone_unbind_cooling_device(s
 				       struct thermal_cooling_device *);
 void thermal_zone_device_update(struct thermal_zone_device *,
 				enum thermal_notify_event);
+void thermal_zone_device_exec(struct thermal_zone_device *tz,
+			      void (*cb)(struct thermal_zone_device *,
+					 unsigned long),
+			      unsigned long data);
 
 struct thermal_cooling_device *thermal_cooling_device_register(const char *,
 		void *, const struct thermal_cooling_device_ops *);
Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -497,6 +497,25 @@ void thermal_zone_device_update(struct t
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 
+/**
+ * thermal_zone_device_exec - Run a callback under the zone lock.
+ * @tz: Thermal zone.
+ * @cb: Callback to run.
+ * @data: Data to pass to the callback.
+ */
+void thermal_zone_device_exec(struct thermal_zone_device *tz,
+				void (*cb)(struct thermal_zone_device *,
+					   unsigned long),
+				unsigned long data)
+{
+	mutex_lock(&tz->lock);
+
+	cb(tz, data);
+
+	mutex_unlock(&tz->lock);
+}
+EXPORT_SYMBOL_GPL(thermal_zone_device_exec);
+
 static void thermal_zone_device_check(struct work_struct *work)
 {
 	struct thermal_zone_device *tz = container_of(work, struct




  reply	other threads:[~2023-08-17 13:11 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-18 18:01 [PATCH v1 0/7] ACPI: thermal: Use trip point table to register thermal zones Rafael J. Wysocki
2023-07-18 18:02 ` [PATCH v1 1/7] thermal: core: Add mechanism for connecting trips with driver data Rafael J. Wysocki
2023-07-18 18:04 ` [PATCH v1 2/7] thermal: core: Do not handle trip points with invalid temperature Rafael J. Wysocki
2023-07-18 18:05 ` [PATCH v1 3/7] ACPI: thermal: Clean up acpi_thermal_register_thermal_zone() Rafael J. Wysocki
2023-07-18 18:06 ` [PATCH v1 4/7] ACPI: thermal: Hold thermal_check_lock around trip updates Rafael J. Wysocki
2023-07-18 18:06 ` [PATCH v1 5/7] ACPI: thermal: Use trip point table to register thermal zones Rafael J. Wysocki
2023-07-19 18:46   ` Rafael J. Wysocki
2023-07-18 18:07 ` [PATCH v1 6/7] ACPI: thermal: Rework thermal_get_trend() Rafael J. Wysocki
2023-07-19 18:50   ` Rafael J. Wysocki
2023-07-18 18:19 ` [PATCH v1 7/7] ACPI: thermal: Drop unnecessary thermal zone callbacks Rafael J. Wysocki
2023-07-18 22:45 ` [PATCH v1 0/7] ACPI: thermal: Use trip point table to register thermal zones Daniel Lezcano
2023-07-19 13:33   ` Rafael J. Wysocki
2023-07-21 12:44 ` [PATCH v2 0/8] " Rafael J. Wysocki
2023-07-21 12:46   ` [PATCH v2 1/8] thermal: core: Add mechanism for connecting trips with driver data Rafael J. Wysocki
2023-07-21 12:47   ` [PATCH v2 2/8] thermal: core: Do not handle trip points with invalid temperature Rafael J. Wysocki
2023-07-21 12:50   ` [PATCH v2 3/8] thermal: core: Add routines for locking and unlocking thermal zones Rafael J. Wysocki
2023-07-21 14:12   ` [PATCH v2 4/8] ACPI: thermal: Clean up acpi_thermal_register_thermal_zone() Rafael J. Wysocki
2023-07-21 14:51   ` [PATCH v2 5/8] ACPI: thermal: Hold thermal zone lock around trip updates Rafael J. Wysocki
2023-07-21 15:00   ` [PATCH v2 6/8] ACPI: thermal: Use trip point table to register thermal zones Rafael J. Wysocki
2023-07-21 15:08   ` [PATCH v2 7/8] ACPI: thermal: Rework thermal_get_trend() Rafael J. Wysocki
2023-07-21 15:10   ` [PATCH v2 8/8] ACPI: thermal: Drop unnecessary thermal zone callbacks Rafael J. Wysocki
2023-07-23 10:19   ` [PATCH v2 0/8] ACPI: thermal: Use trip point table to register thermal zones Daniel Lezcano
2023-07-24  7:59     ` Rafael J. Wysocki
2023-07-25 12:02 ` [PATCH v3 " Rafael J. Wysocki
2023-07-25 12:04   ` [PATCH v3 1/8] thermal: core: Add mechanism for connecting trips with driver data Rafael J. Wysocki
2023-08-01 18:29     ` Daniel Lezcano
2023-08-01 19:02       ` Rafael J. Wysocki
2023-08-02 12:34         ` Daniel Lezcano
2023-08-02 13:03           ` Rafael J. Wysocki
2023-08-02 15:50             ` Daniel Lezcano
2023-08-02 16:48               ` Rafael J. Wysocki
2023-08-03 13:06                 ` Daniel Lezcano
2023-08-03 14:15                   ` Rafael J. Wysocki
2023-08-03 16:20                     ` Daniel Lezcano
2023-08-03 19:58                       ` Rafael J. Wysocki
2023-08-04  8:17                         ` Daniel Lezcano
2023-08-04 11:14                           ` Rafael J. Wysocki
2023-07-25 12:06   ` [PATCH v3 2/8] thermal: core: Do not handle trip points with invalid temperature Rafael J. Wysocki
2023-08-01 18:29     ` Daniel Lezcano
2023-08-01 19:05       ` Rafael J. Wysocki
2023-08-01 19:31         ` Daniel Lezcano
2023-07-25 12:08   ` [PATCH v3 3/8] thermal: core: Add routines for locking and unlocking thermal zones Rafael J. Wysocki
2023-08-01 18:30     ` Daniel Lezcano
2023-08-01 19:11       ` Rafael J. Wysocki
2023-07-25 12:09   ` [PATCH v3 4/8] ACPI: thermal: Clean up acpi_thermal_register_thermal_zone() Rafael J. Wysocki
2023-07-25 12:16   ` [PATCH v3 5/8] ACPI: thermal: Hold thermal zone lock around trip updates Rafael J. Wysocki
2023-08-01 18:39     ` Daniel Lezcano
2023-08-01 18:51       ` Rafael J. Wysocki
2023-07-25 12:20   ` [PATCH v3 6/8] ACPI: thermal: Use trip point table to register thermal zones Rafael J. Wysocki
2023-07-25 18:08     ` Rafael J. Wysocki
2023-07-25 18:12     ` [PATCH v3.1 " Rafael J. Wysocki
2023-07-25 12:22   ` [PATCH v3 7/8] ACPI: thermal: Rework thermal_get_trend() Rafael J. Wysocki
2023-07-25 12:24   ` [PATCH v3 8/8] ACPI: thermal: Drop unnecessary thermal zone callbacks Rafael J. Wysocki
2023-08-01 18:26   ` [PATCH v3 0/8] ACPI: thermal: Use trip point table to register thermal zones Daniel Lezcano
2023-08-01 18:39     ` Rafael J. Wysocki
2023-08-04 20:58 ` [PATCH v4 00/10] " Rafael J. Wysocki
2023-08-04 21:00   ` [PATCH v4 01/10] thermal: core: Do not handle trip points with invalid temperature Rafael J. Wysocki
2023-08-07 11:25     ` Daniel Lezcano
2023-08-04 21:03   ` [PATCH v4 02/10] thermal: core: Introduce thermal_zone_device_adjust() Rafael J. Wysocki
2023-08-04 21:04   ` [PATCH v4 03/10] thermal: core: Add priv pointer to struct thermal_trip Rafael J. Wysocki
2023-08-07 11:25     ` Daniel Lezcano
2023-08-04 21:05   ` [PATCH v4 04/10] thermal: core: Add thermal_zone_update_trip_temp() helper routine Rafael J. Wysocki
2023-08-07 11:29     ` Daniel Lezcano
2023-08-07 15:40       ` Rafael J. Wysocki
2023-08-07 16:17         ` Daniel Lezcano
2023-08-07 16:23           ` Rafael J. Wysocki
2023-08-04 21:07   ` [PATCH v4 05/10] ACPI: thermal: Clean up acpi_thermal_register_thermal_zone() Rafael J. Wysocki
2023-08-04 21:13   ` [PATCH v4 06/10] ACPI: thermal: Carry out trip point updates under zone lock Rafael J. Wysocki
2023-08-04 21:18   ` [PATCH v4 07/10] ACPI: thermal: Use trip point table to register thermal zones Rafael J. Wysocki
2023-08-04 21:22   ` [PATCH v4 08/10] ACPI: thermal: Rework thermal_get_trend() Rafael J. Wysocki
2023-08-04 21:24   ` [PATCH v4 09/10] ACPI: thermal: Drop unnecessary thermal zone callbacks Rafael J. Wysocki
2023-08-04 21:25   ` [PATCH v4 10/10] thermal: core: Eliminate code duplication from acpi_thermal_notify() Rafael J. Wysocki
2023-08-07 17:59 ` [PATCH v5 00/11] ACPI: thermal: Use trip point table to register thermal zones Rafael J. Wysocki
2023-08-07 18:01   ` [PATCH v5 01/11] thermal: core: Do not handle trip points with invalid temperature Rafael J. Wysocki
2023-08-07 18:02   ` [PATCH v5 02/11] thermal: core: Introduce thermal_zone_device_adjust() Rafael J. Wysocki
2023-08-07 18:04   ` [PATCH v5 03/11] thermal: core: Add priv pointer to struct thermal_trip Rafael J. Wysocki
2023-08-07 18:06   ` [PATCH v5 04/11] ACPI: thermal: Clean up acpi_thermal_register_thermal_zone() Rafael J. Wysocki
2023-08-07 18:08   ` [PATCH v5 05/11] ACPI: thermal: Carry out trip point updates under zone lock Rafael J. Wysocki
2023-08-16 16:25     ` Daniel Lezcano
2023-08-17 13:09       ` Rafael J. Wysocki [this message]
2023-08-07 18:10   ` [PATCH v5 06/11] ACPI: thermal: Introduce struct acpi_thermal_trip Rafael J. Wysocki
2023-08-07 18:11   ` [PATCH v5 07/11] thermal: core: Rework and rename __for_each_thermal_trip() Rafael J. Wysocki
2023-08-07 18:14   ` [PATCH v5 08/11] ACPI: thermal: Use trip point table to register thermal zones Rafael J. Wysocki
2023-08-09 11:07     ` [PATCH v5.1 " Rafael J. Wysocki
2023-08-07 18:17   ` [PATCH v5 09/11] ACPI: thermal: Rework thermal_get_trend() Rafael J. Wysocki
2023-08-07 18:18   ` [PATCH v5 10/11] ACPI: thermal: Drop unnecessary thermal zone callbacks Rafael J. Wysocki
2023-08-07 18:20   ` [PATCH v5 11/11] thermal: core: Eliminate code duplication from acpi_thermal_notify() Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3262036.aeNJFYEL58@kreacher \
    --to=rjw@rjwysocki.net \
    --cc=daniel.lezcano@linaro.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=michal.wilczynski@intel.com \
    --cc=rui.zhang@intel.com \
    --cc=srinivas.pandruvada@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox