linux-pm.vger.kernel.org archive mirror
* [PATCH v2] interconnect: Use rt_mutex for icc_bw_lock
@ 2025-05-06 14:51 Mike Tipton
  2025-05-16 15:50 ` Georgi Djakov
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Tipton @ 2025-05-06 14:51 UTC (permalink / raw)
  To: Georgi Djakov
  Cc: linux-pm, linux-kernel, quic_aiquny, quic_okukatla, Mike Tipton

The icc_set_bw() function is often used in latency sensitive paths to
scale BW on a per-frame basis by high priority clients such as GPU and
display. However, there are many low priority clients of icc_set_bw() as
well. This can lead to priority inversion and unacceptable delays for
the high priority clients, which in the case of GPU and display can
result in frame drops and visual glitches.

To prevent this priority inversion, switch to using rt_mutex for
icc_bw_lock. This isn't needed for icc_lock since that's not used in the
critical, latency-sensitive voting paths.

Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com>
---

Since the original patch was posted a couple years ago, we've continued
to hit this for display and now for GPU as well. How frequently depends
heavily on the specific chip, product, and use case. Different
configurations hit it easier than others. But for both cases it results
in obvious visual glitches.

The paths being voted for (primarily DDR) are fundamentally shared
between clients of all types and priority levels. We can't control their
priorities, so aside from having those priorities inherited we're always
subject to these sorts of inversions.

The motivation isn't really for general performance improvement, but
instead to fix the rare cases of visual glitches and artifacts.

A similar patch was posted last year [1] to address the same class of
problems.

[1] https://lore.kernel.org/all/20240220074300.10805-1-wangrumeng@xiaomi.corp-partner.google.com/

Changes in v2:
- Rebase onto linux-next.
- Select RT_MUTEXES in Kconfig.
- Only use rt_mutex for icc_bw_lock since now there are separate locks
  and icc_lock isn't in the critical path.
- Reword commit text.
- Link to v1: https://lore.kernel.org/all/20220906191423.30109-1-quic_mdtipton@quicinc.com/

 drivers/interconnect/Kconfig |  1 +
 drivers/interconnect/core.c  | 23 ++++++++++++-----------
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/interconnect/Kconfig b/drivers/interconnect/Kconfig
index f2e49bd97d31..f6fd5f2d7d40 100644
--- a/drivers/interconnect/Kconfig
+++ b/drivers/interconnect/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 menuconfig INTERCONNECT
 	bool "On-Chip Interconnect management support"
+	select RT_MUTEXES
 	help
 	  Support for management of the on-chip interconnects.
 
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 1a41e59c77f8..2e86a3c95d1a 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -14,6 +14,7 @@
 #include <linux/interconnect-provider.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
+#include <linux/rtmutex.h>
 #include <linux/slab.h>
 #include <linux/of.h>
 #include <linux/overflow.h>
@@ -30,7 +31,7 @@ static LIST_HEAD(icc_providers);
 static int providers_count;
 static bool synced_state;
 static DEFINE_MUTEX(icc_lock);
-static DEFINE_MUTEX(icc_bw_lock);
+static DEFINE_RT_MUTEX(icc_bw_lock);
 static struct dentry *icc_debugfs_dir;
 
 static void icc_summary_show_one(struct seq_file *s, struct icc_node *n)
@@ -178,7 +179,7 @@ static struct icc_path *path_init(struct device *dev, struct icc_node *dst,
 
 	path->num_nodes = num_nodes;
 
-	mutex_lock(&icc_bw_lock);
+	rt_mutex_lock(&icc_bw_lock);
 
 	for (i = num_nodes - 1; i >= 0; i--) {
 		node->provider->users++;
@@ -190,7 +191,7 @@ static struct icc_path *path_init(struct device *dev, struct icc_node *dst,
 		node = node->reverse;
 	}
 
-	mutex_unlock(&icc_bw_lock);
+	rt_mutex_unlock(&icc_bw_lock);
 
 	return path;
 }
@@ -704,7 +705,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw)
 	if (WARN_ON(IS_ERR(path) || !path->num_nodes))
 		return -EINVAL;
 
-	mutex_lock(&icc_bw_lock);
+	rt_mutex_lock(&icc_bw_lock);
 
 	old_avg = path->reqs[0].avg_bw;
 	old_peak = path->reqs[0].peak_bw;
@@ -736,7 +737,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw)
 		apply_constraints(path);
 	}
 
-	mutex_unlock(&icc_bw_lock);
+	rt_mutex_unlock(&icc_bw_lock);
 
 	trace_icc_set_bw_end(path, ret);
 
@@ -798,7 +799,7 @@ void icc_put(struct icc_path *path)
 		pr_err("%s: error (%d)\n", __func__, ret);
 
 	mutex_lock(&icc_lock);
-	mutex_lock(&icc_bw_lock);
+	rt_mutex_lock(&icc_bw_lock);
 
 	for (i = 0; i < path->num_nodes; i++) {
 		node = path->reqs[i].node;
@@ -807,7 +808,7 @@ void icc_put(struct icc_path *path)
 			node->provider->users--;
 	}
 
-	mutex_unlock(&icc_bw_lock);
+	rt_mutex_unlock(&icc_bw_lock);
 	mutex_unlock(&icc_lock);
 
 	kfree(path->name);
@@ -1023,7 +1024,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
 		return;
 
 	mutex_lock(&icc_lock);
-	mutex_lock(&icc_bw_lock);
+	rt_mutex_lock(&icc_bw_lock);
 
 	node->provider = provider;
 	list_add_tail(&node->node_list, &provider->nodes);
@@ -1056,7 +1057,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
 	node->avg_bw = 0;
 	node->peak_bw = 0;
 
-	mutex_unlock(&icc_bw_lock);
+	rt_mutex_unlock(&icc_bw_lock);
 	mutex_unlock(&icc_lock);
 }
 EXPORT_SYMBOL_GPL(icc_node_add);
@@ -1182,7 +1183,7 @@ void icc_sync_state(struct device *dev)
 		return;
 
 	mutex_lock(&icc_lock);
-	mutex_lock(&icc_bw_lock);
+	rt_mutex_lock(&icc_bw_lock);
 	synced_state = true;
 	list_for_each_entry(p, &icc_providers, provider_list) {
 		dev_dbg(p->dev, "interconnect provider is in synced state\n");
@@ -1195,7 +1196,7 @@ void icc_sync_state(struct device *dev)
 			}
 		}
 	}
-	mutex_unlock(&icc_bw_lock);
+	rt_mutex_unlock(&icc_bw_lock);
 	mutex_unlock(&icc_lock);
 }
 EXPORT_SYMBOL_GPL(icc_sync_state);
-- 
2.34.1



* Re: [PATCH v2] interconnect: Use rt_mutex for icc_bw_lock
  2025-05-06 14:51 [PATCH v2] interconnect: Use rt_mutex for icc_bw_lock Mike Tipton
@ 2025-05-16 15:50 ` Georgi Djakov
  2025-07-10 12:51   ` Aiqun(Maria) Yu
  0 siblings, 1 reply; 3+ messages in thread
From: Georgi Djakov @ 2025-05-16 15:50 UTC (permalink / raw)
  To: Mike Tipton; +Cc: linux-pm, linux-kernel, quic_aiquny, quic_okukatla

Hi Mike,

On 6.05.25 17:51, Mike Tipton wrote:
> The icc_set_bw() function is often used in latency sensitive paths to
> scale BW on a per-frame basis by high priority clients such as GPU and
> display. However, there are many low priority clients of icc_set_bw() as
> well. This can lead to priority inversion and unacceptable delays for
> the high priority clients, which in the case of GPU and display can
> result in frame drops and visual glitches.

Ok, so the issue we see is caused by lock contention, as we have many
clients and some of them try to do very aggressive scaling.

> To prevent this priority inversion, switch to using rt_mutex for
> icc_bw_lock. This isn't needed for icc_lock since that's not used in the
> critical, latency-sensitive voting paths.

If the issue does not occur anymore with this patch, then this is a good
sign, but we still need to get some numbers and put them in the commit
message. The RT mutexes add some overhead and complexity that could
increase latency for both uncontended and contended paths. I am curious
if there is any regression for the non-priority scenarios. Also if there
are many threads, the mutex cost itself could become a bottleneck.

> 
> Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com>
> ---
> 
> Since the original patch was posted a couple years ago, we've continued
> to hit this for display and now for GPU as well. How frequently depends
> heavily on the specific chip, product, and use case. Different
> configurations hit it easier than others. But for both cases it results
> in obvious visual glitches.
> 
> The paths being voted for (primarily DDR) are fundamentally shared
> between clients of all types and priority levels. We can't control their
> priorities, so aside from having those priorities inherited we're always
> subject to these sorts of inversions.
> 
> The motivation isn't really for general performance improvement, but
> instead to fix the rare cases of visual glitches and artifacts.
> 
> A similar patch was posted last year [1] to address similar problems.
> 
> [1] https://lore.kernel.org/all/20240220074300.10805-1-wangrumeng@xiaomi.corp-partner.google.com/
> 
> Changes in v2:
> - Rebase onto linux-next.
> - Select RT_MUTEXES in Kconfig.
> - Only use rt_mutex for icc_bw_lock since now there are separate locks
>    and icc_lock isn't in the critical path.
> - Reword commit text.
> - Link to v1: https://lore.kernel.org/all/20220906191423.30109-1-quic_mdtipton@quicinc.com/
> 
>   drivers/interconnect/Kconfig |  1 +
>   drivers/interconnect/core.c  | 23 ++++++++++++-----------
>   2 files changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/interconnect/Kconfig b/drivers/interconnect/Kconfig
> index f2e49bd97d31..f6fd5f2d7d40 100644
> --- a/drivers/interconnect/Kconfig
> +++ b/drivers/interconnect/Kconfig
> @@ -1,6 +1,7 @@
>   # SPDX-License-Identifier: GPL-2.0-only
>   menuconfig INTERCONNECT
>   	bool "On-Chip Interconnect management support"
> +	select RT_MUTEXES

This unconditionally pulls in all the RT-mutex stuff, which some people
might not want (although today it's also selected by the I2C subsystem
for example). I am wondering if we should make it configurable with the
normal mutex being the default or just follow the i2c example... but
maybe we can decide this when we have some numbers.
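
One possible shape for such an opt-in, sketched as a hypothetical
Kconfig fragment (the symbol name INTERCONNECT_RT_MUTEX and the help
text are assumptions, not part of the patch):

```kconfig
config INTERCONNECT_RT_MUTEX
	bool "Use rt_mutex for the interconnect bandwidth lock"
	depends on INTERCONNECT
	select RT_MUTEXES
	help
	  Use a priority-inheriting rt_mutex for icc_bw_lock so that
	  low-priority bandwidth votes cannot delay high-priority
	  clients such as GPU and display. If unsure, say N.
```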

Thanks,
Georgi

>   	help
>   	  Support for management of the on-chip interconnects.
>   
> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> index 1a41e59c77f8..2e86a3c95d1a 100644
> --- a/drivers/interconnect/core.c
> +++ b/drivers/interconnect/core.c
> @@ -14,6 +14,7 @@
>   #include <linux/interconnect-provider.h>
>   #include <linux/list.h>
>   #include <linux/mutex.h>
> +#include <linux/rtmutex.h>
>   #include <linux/slab.h>
>   #include <linux/of.h>
>   #include <linux/overflow.h>
> @@ -30,7 +31,7 @@ static LIST_HEAD(icc_providers);
>   static int providers_count;
>   static bool synced_state;
>   static DEFINE_MUTEX(icc_lock);
> -static DEFINE_MUTEX(icc_bw_lock);
> +static DEFINE_RT_MUTEX(icc_bw_lock);
>   static struct dentry *icc_debugfs_dir;
>   
>   static void icc_summary_show_one(struct seq_file *s, struct icc_node *n)
> @@ -178,7 +179,7 @@ static struct icc_path *path_init(struct device *dev, struct icc_node *dst,
>   
>   	path->num_nodes = num_nodes;
>   
> -	mutex_lock(&icc_bw_lock);
> +	rt_mutex_lock(&icc_bw_lock);
>   
>   	for (i = num_nodes - 1; i >= 0; i--) {
>   		node->provider->users++;
> @@ -190,7 +191,7 @@ static struct icc_path *path_init(struct device *dev, struct icc_node *dst,
>   		node = node->reverse;
>   	}
>   
> -	mutex_unlock(&icc_bw_lock);
> +	rt_mutex_unlock(&icc_bw_lock);
>   
>   	return path;
>   }
> @@ -704,7 +705,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw)
>   	if (WARN_ON(IS_ERR(path) || !path->num_nodes))
>   		return -EINVAL;
>   
> -	mutex_lock(&icc_bw_lock);
> +	rt_mutex_lock(&icc_bw_lock);
>   
>   	old_avg = path->reqs[0].avg_bw;
>   	old_peak = path->reqs[0].peak_bw;
> @@ -736,7 +737,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw)
>   		apply_constraints(path);
>   	}
>   
> -	mutex_unlock(&icc_bw_lock);
> +	rt_mutex_unlock(&icc_bw_lock);
>   
>   	trace_icc_set_bw_end(path, ret);
>   
> @@ -798,7 +799,7 @@ void icc_put(struct icc_path *path)
>   		pr_err("%s: error (%d)\n", __func__, ret);
>   
>   	mutex_lock(&icc_lock);
> -	mutex_lock(&icc_bw_lock);
> +	rt_mutex_lock(&icc_bw_lock);
>   
>   	for (i = 0; i < path->num_nodes; i++) {
>   		node = path->reqs[i].node;
> @@ -807,7 +808,7 @@ void icc_put(struct icc_path *path)
>   			node->provider->users--;
>   	}
>   
> -	mutex_unlock(&icc_bw_lock);
> +	rt_mutex_unlock(&icc_bw_lock);
>   	mutex_unlock(&icc_lock);
>   
>   	kfree(path->name);
> @@ -1023,7 +1024,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
>   		return;
>   
>   	mutex_lock(&icc_lock);
> -	mutex_lock(&icc_bw_lock);
> +	rt_mutex_lock(&icc_bw_lock);
>   
>   	node->provider = provider;
>   	list_add_tail(&node->node_list, &provider->nodes);
> @@ -1056,7 +1057,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
>   	node->avg_bw = 0;
>   	node->peak_bw = 0;
>   
> -	mutex_unlock(&icc_bw_lock);
> +	rt_mutex_unlock(&icc_bw_lock);
>   	mutex_unlock(&icc_lock);
>   }
>   EXPORT_SYMBOL_GPL(icc_node_add);
> @@ -1182,7 +1183,7 @@ void icc_sync_state(struct device *dev)
>   		return;
>   
>   	mutex_lock(&icc_lock);
> -	mutex_lock(&icc_bw_lock);
> +	rt_mutex_lock(&icc_bw_lock);
>   	synced_state = true;
>   	list_for_each_entry(p, &icc_providers, provider_list) {
>   		dev_dbg(p->dev, "interconnect provider is in synced state\n");
> @@ -1195,7 +1196,7 @@ void icc_sync_state(struct device *dev)
>   			}
>   		}
>   	}
> -	mutex_unlock(&icc_bw_lock);
> +	rt_mutex_unlock(&icc_bw_lock);
>   	mutex_unlock(&icc_lock);
>   }
>   EXPORT_SYMBOL_GPL(icc_sync_state);



* Re: [PATCH v2] interconnect: Use rt_mutex for icc_bw_lock
  2025-05-16 15:50 ` Georgi Djakov
@ 2025-07-10 12:51   ` Aiqun(Maria) Yu
  0 siblings, 0 replies; 3+ messages in thread
From: Aiqun(Maria) Yu @ 2025-07-10 12:51 UTC (permalink / raw)
  To: Georgi Djakov, Mike Tipton; +Cc: linux-pm, linux-kernel, quic_okukatla

On 5/16/2025 11:50 PM, Georgi Djakov wrote:
> Hi Mike,
...
>> result in frame drops and visual glitches.
> 
> Ok, so the issue we see is caused by lock contention, as we have many
> clients and some of them try to do very aggressive scaling.
> 
>> To prevent this priority inversion, switch to using rt_mutex for
>> icc_bw_lock. This isn't needed for icc_lock since that's not used in the
>> critical, latency-sensitive voting paths.
> 
> If the issue does not occur anymore with this patch, then this is a good
> sign, but we still need to get some numbers and put them in the commit
> message. The RT mutexes add some overhead and complexity that could
I have some preliminary latency numbers for the icc_lock mutex on my
Android phone under normal conditions, ranging from 50 to 1000
nanoseconds. I observed three normal-priority tasks and one real-time
(RT) task contending for icc_lock. The latency numbers are not
differentiated between RT and normal tasks, but the 1000ns latency was
observed on the RT task.

The latency numbers can vary significantly depending on the scenario.
Please feel free to suggest any specific testing scenarios to capture
the numbers you are interested in.

The delay depends on the scheduler's tick granularity. For instance,
with a 250Hz scheduler tick in a single-CPU case, the delay is likely
to be around 4ms per sched_tick, depending also on the vruntime of the
other runnable tasks. Since both real-time (RT) tasks and normal tasks
may compete for this particular mutex, it is advisable to use an
rt_mutex to enhance real-time performance.

Here is the potential flow for better understanding:

   +--------------+          +-------------------+
   | RT Task A    |          | Normal CFS task B |
   +--------------+          +-------------------+
                             mutex_lock(&icc_lock)
                             runnable, but waiting behind other
                             high-prio normal tasks
                             ~4ms per sched_tick before it gets
                             a chance to run
   call icc_set_bw()
   mutex_lock(&icc_lock)
                             gets the chance to run
                             --> mutex_unlock(&icc_lock)
                             --> deboost task B prio
   gets the lock

> increase latency for both uncontended and contended paths. I am curious

Yes, there will be some overhead. However, if the rt_mutex lets an RT
task get through the icc_lock critical section sooner, and the resulting
clock settings benefit the entire system, the trade-off could still be
advantageous. For example, raising the clock speed earlier could lead to
an overall performance boost. In theory, this approach is worth
considering.
> if there is any regression for the non-priority scenarios. Also if there
> are many threads, the mutex cost itself could become a bottleneck.
> 
>>
>> Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com>

-- 
Thx and BRs,
Aiqun(Maria) Yu

