* [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
@ 2025-05-29 14:46 ` Gabor Juhos
2025-05-30 9:16 ` Bryan O'Donoghue
2025-06-18 12:43 ` Johan Hovold
0 siblings, 2 replies; 8+ messages in thread
From: Gabor Juhos @ 2025-05-29 14:46 UTC (permalink / raw)
To: Georgi Djakov, Raviteja Laggyshetty
Cc: linux-pm, linux-arm-msm, linux-kernel, Gabor Juhos
The 'icc_bw_lock' mutex is introduced in commit af42269c3523
("interconnect: Fix locking for runpm vs reclaim") in order
to decouple serialization of bw aggregation from codepaths
that require memory allocation.
However commit d30f83d278a9 ("interconnect: core: Add dynamic
id allocation support") added a devm_kasprintf() call into a
path protected by the 'icc_bw_lock' which causes this lockdep
warning (at least on the IPQ9574 platform):
======================================================
WARNING: possible circular locking dependency detected
6.15.0-next-20250529 #0 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108
but task is already holding lock:
ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire+0x7c/0xb8
slab_alloc_node.isra.0+0x48/0x188
__kmalloc_node_track_caller_noprof+0xa4/0x2b8
devm_kmalloc+0x5c/0x138
devm_kvasprintf+0x6c/0xb8
devm_kasprintf+0x50/0x68
icc_node_add+0xbc/0x160
icc_clk_register+0x15c/0x230
devm_icc_clk_register+0x20/0x90
qcom_cc_really_probe+0x320/0x338
nss_cc_ipq9574_probe+0xac/0x1e8
platform_probe+0x70/0xd0
really_probe+0xdc/0x3b8
__driver_probe_device+0x94/0x178
driver_probe_device+0x48/0xf0
__driver_attach+0x13c/0x208
bus_for_each_dev+0x6c/0xb8
driver_attach+0x2c/0x40
bus_add_driver+0x100/0x250
driver_register+0x68/0x138
__platform_driver_register+0x2c/0x40
nss_cc_ipq9574_driver_init+0x24/0x38
do_one_initcall+0x88/0x340
kernel_init_freeable+0x2ac/0x4f8
kernel_init+0x28/0x1e8
ret_from_fork+0x10/0x20
-> #0 (icc_bw_lock){+.+.}-{4:4}:
__lock_acquire+0x1348/0x2090
lock_acquire+0x108/0x2d8
icc_init+0x50/0x108
do_one_initcall+0x88/0x340
kernel_init_freeable+0x2ac/0x4f8
kernel_init+0x28/0x1e8
ret_from_fork+0x10/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(icc_bw_lock);
lock(fs_reclaim);
lock(icc_bw_lock);
*** DEADLOCK ***
1 lock held by swapper/0/1:
#0: ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
stack backtrace:
CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-next-20250529 #0 NONE
Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
Call trace:
show_stack+0x20/0x38 (C)
dump_stack_lvl+0x90/0xd0
dump_stack+0x18/0x28
print_circular_bug+0x334/0x448
check_noncircular+0x12c/0x140
__lock_acquire+0x1348/0x2090
lock_acquire+0x108/0x2d8
icc_init+0x50/0x108
do_one_initcall+0x88/0x340
kernel_init_freeable+0x2ac/0x4f8
kernel_init+0x28/0x1e8
ret_from_fork+0x10/0x20
Move the memory allocation part of the code outside of the protected
path to eliminate the warning. Also add a note about why it is moved
to there,
Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
---
drivers/interconnect/core.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
return;
mutex_lock(&icc_lock);
+
+ if (node->id >= ICC_DYN_ID_START) {
+ /*
+ * Memory allocation must be done outside of codepaths
+ * protected by icc_bw_lock.
+ */
+ node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
+ node->name, dev_name(provider->dev));
+ }
+
mutex_lock(&icc_bw_lock);
node->provider = provider;
@@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
node->avg_bw = node->init_avg;
node->peak_bw = node->init_peak;
- if (node->id >= ICC_DYN_ID_START)
- node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
- node->name, dev_name(provider->dev));
-
if (node->avg_bw || node->peak_bw) {
if (provider->pre_aggregate)
provider->pre_aggregate(node);
---
base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2
change-id: 20250529-icc-bw-lockdep-ed030d892a19
Best regards,
--
Gabor Juhos <j4g8y7@gmail.com>
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
2025-05-29 14:46 ` [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held Gabor Juhos
@ 2025-05-30 9:16 ` Bryan O'Donoghue
2025-06-03 9:15 ` Gabor Juhos
2025-06-18 12:43 ` Johan Hovold
1 sibling, 1 reply; 8+ messages in thread
From: Bryan O'Donoghue @ 2025-05-30 9:16 UTC (permalink / raw)
To: Gabor Juhos, Georgi Djakov, Raviteja Laggyshetty
Cc: linux-pm, linux-arm-msm, linux-kernel
On 29/05/2025 15:46, Gabor Juhos wrote:
> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
> ("interconnect: Fix locking for runpm vs reclaim") in order
> to decouple serialization of bw aggregation from codepaths
> that require memory allocation.
>
> However commit d30f83d278a9 ("interconnect: core: Add dynamic
> id allocation support") added a devm_kasprintf() call into a
> path protected by the 'icc_bw_lock' which causes this lockdep
> warning (at least on the IPQ9574 platform):
Missing a Fixes tag.
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.15.0-next-20250529 #0 Not tainted
> ------------------------------------------------------
> swapper/0/1 is trying to acquire lock:
> ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108
>
> but task is already holding lock:
> ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (fs_reclaim){+.+.}-{0:0}:
> fs_reclaim_acquire+0x7c/0xb8
> slab_alloc_node.isra.0+0x48/0x188
> __kmalloc_node_track_caller_noprof+0xa4/0x2b8
> devm_kmalloc+0x5c/0x138
> devm_kvasprintf+0x6c/0xb8
> devm_kasprintf+0x50/0x68
> icc_node_add+0xbc/0x160
> icc_clk_register+0x15c/0x230
> devm_icc_clk_register+0x20/0x90
> qcom_cc_really_probe+0x320/0x338
> nss_cc_ipq9574_probe+0xac/0x1e8
> platform_probe+0x70/0xd0
> really_probe+0xdc/0x3b8
> __driver_probe_device+0x94/0x178
> driver_probe_device+0x48/0xf0
> __driver_attach+0x13c/0x208
> bus_for_each_dev+0x6c/0xb8
> driver_attach+0x2c/0x40
> bus_add_driver+0x100/0x250
> driver_register+0x68/0x138
> __platform_driver_register+0x2c/0x40
> nss_cc_ipq9574_driver_init+0x24/0x38
> do_one_initcall+0x88/0x340
> kernel_init_freeable+0x2ac/0x4f8
> kernel_init+0x28/0x1e8
> ret_from_fork+0x10/0x20
>
> -> #0 (icc_bw_lock){+.+.}-{4:4}:
> __lock_acquire+0x1348/0x2090
> lock_acquire+0x108/0x2d8
> icc_init+0x50/0x108
> do_one_initcall+0x88/0x340
> kernel_init_freeable+0x2ac/0x4f8
> kernel_init+0x28/0x1e8
> ret_from_fork+0x10/0x20
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(fs_reclaim);
> lock(icc_bw_lock);
> lock(fs_reclaim);
> lock(icc_bw_lock);
>
> *** DEADLOCK ***
>
> 1 lock held by swapper/0/1:
> #0: ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
>
> stack backtrace:
> CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-next-20250529 #0 NONE
> Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
> Call trace:
> show_stack+0x20/0x38 (C)
> dump_stack_lvl+0x90/0xd0
> dump_stack+0x18/0x28
> print_circular_bug+0x334/0x448
> check_noncircular+0x12c/0x140
> __lock_acquire+0x1348/0x2090
> lock_acquire+0x108/0x2d8
> icc_init+0x50/0x108
> do_one_initcall+0x88/0x340
> kernel_init_freeable+0x2ac/0x4f8
> kernel_init+0x28/0x1e8
> ret_from_fork+0x10/0x20
>
> Move the memory allocation part of the code outside of the protected
> path to eliminate the warning. Also add a note about why it is moved
> to there,
>
> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
> ---
> drivers/interconnect/core.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644
> --- a/drivers/interconnect/core.c
> +++ b/drivers/interconnect/core.c
> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
> return;
>
> mutex_lock(&icc_lock);
> +
> + if (node->id >= ICC_DYN_ID_START) {
> + /*
> + * Memory allocation must be done outside of codepaths
> + * protected by icc_bw_lock.
> + */
> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> + node->name, dev_name(provider->dev));
> + }
> +
> mutex_lock(&icc_bw_lock);
>
> node->provider = provider;
> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
> node->avg_bw = node->init_avg;
> node->peak_bw = node->init_peak;
>
> - if (node->id >= ICC_DYN_ID_START)
> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> - node->name, dev_name(provider->dev));
> -
> if (node->avg_bw || node->peak_bw) {
> if (provider->pre_aggregate)
> provider->pre_aggregate(node);
>
> ---
> base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2
> change-id: 20250529-icc-bw-lockdep-ed030d892a19
>
> Best regards,
> --
> Gabor Juhos <j4g8y7@gmail.com>
>
>
The locking in this code is a mess.
Which data-structures does icc_lock protect node* pointers I think and
which data-structures does icc_bw_lock protect - "bw" data structures ?
Hmm.
Looking at this code I'm not sure at all what icc_lock was introduced to do.
Can we not just drop it entirely ?
---
bod
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
2025-05-30 9:16 ` Bryan O'Donoghue
@ 2025-06-03 9:15 ` Gabor Juhos
2025-06-03 10:01 ` Bryan O'Donoghue
0 siblings, 1 reply; 8+ messages in thread
From: Gabor Juhos @ 2025-06-03 9:15 UTC (permalink / raw)
To: Bryan O'Donoghue, Georgi Djakov, Raviteja Laggyshetty
Cc: linux-pm, linux-arm-msm, linux-kernel
Hello Bryan,
Sorry for the late reply, I missed your mail.
2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta:
> On 29/05/2025 15:46, Gabor Juhos wrote:
>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
>> ("interconnect: Fix locking for runpm vs reclaim") in order
>> to decouple serialization of bw aggregation from codepaths
>> that require memory allocation.
>>
>> However commit d30f83d278a9 ("interconnect: core: Add dynamic
>> id allocation support") added a devm_kasprintf() call into a
>> path protected by the 'icc_bw_lock' which causes this lockdep
>> warning (at least on the IPQ9574 platform):
>
> Missing a Fixes tag.
Erm, it is before my s-o-b tag.
...
>> Move the memory allocation part of the code outside of the protected
>> path to eliminate the warning. Also add a note about why it is moved
>> to there,
>>
>> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
>> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
>> ---
>> drivers/interconnect/core.c | 14 ++++++++++----
>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
>> index
>> 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644
>> --- a/drivers/interconnect/core.c
>> +++ b/drivers/interconnect/core.c
>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct
>> icc_provider *provider)
>> return;
>>
>> mutex_lock(&icc_lock);
>> +
>> + if (node->id >= ICC_DYN_ID_START) {
>> + /*
>> + * Memory allocation must be done outside of codepaths
>> + * protected by icc_bw_lock.
>> + */
>> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>> + node->name, dev_name(provider->dev));
>> + }
>> +
>> mutex_lock(&icc_bw_lock);
>>
>> node->provider = provider;
>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct
>> icc_provider *provider)
>> node->avg_bw = node->init_avg;
>> node->peak_bw = node->init_peak;
>>
>> - if (node->id >= ICC_DYN_ID_START)
>> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>> - node->name, dev_name(provider->dev));
>> -
>> if (node->avg_bw || node->peak_bw) {
>> if (provider->pre_aggregate)
>> provider->pre_aggregate(node);
>>
>> ---
>> base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2
>> change-id: 20250529-icc-bw-lockdep-ed030d892a19
>>
>> Best regards,
>> --
>> Gabor Juhos <j4g8y7@gmail.com>
>>
>>
>
> The locking in this code is a mess.
>
> Which data-structures does icc_lock protect node* pointers I think and which
> data-structures does icc_bw_lock protect - "bw" data structures ?
>
> Hmm.
>
> Looking at this code I'm not sure at all what icc_lock was introduced to do.
Initially, only the 'icc_lock' mutex was here, and that protected 'everything'.
The 'icc_bw_lock' has been introduced later by commit af42269c3523
("interconnect: Fix locking for runpm vs reclaim") as part of the
"drm/msm+PM+icc: Make job_run() reclaim-safe" series [1].
Here is the reason copied from the original commit message:
"For cases where icc_bw_set() can be called in callbaths that could
deadlock against shrinker/reclaim, such as runpm resume, we need to
decouple the icc locking. Introduce a new icc_bw_lock for cases where
we need to serialize bw aggregation and update to decouple that from
paths that require memory allocation such as node/link creation/
destruction."
> Can we not just drop it entirely ?
I'm not an expert in locking, but I doubt that we can easily drop any of the two
mutexes without reintroducing the problem fixed by the change mentioned above.
[1] https://lore.kernel.org/all/20230807171148.210181-1-robdclark@gmail.com/
Regards,
Gabor
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
2025-06-03 9:15 ` Gabor Juhos
@ 2025-06-03 10:01 ` Bryan O'Donoghue
2025-06-18 12:50 ` Johan Hovold
0 siblings, 1 reply; 8+ messages in thread
From: Bryan O'Donoghue @ 2025-06-03 10:01 UTC (permalink / raw)
To: Gabor Juhos, Bryan O'Donoghue, Georgi Djakov,
Raviteja Laggyshetty
Cc: linux-pm, linux-arm-msm, linux-kernel
On 03/06/2025 10:15, Gabor Juhos wrote:
> Hello Bryan,
>
> Sorry for the late reply, I missed your mail.
>
> 2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta:
>> On 29/05/2025 15:46, Gabor Juhos wrote:
>>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
>>> ("interconnect: Fix locking for runpm vs reclaim") in order
>>> to decouple serialization of bw aggregation from codepaths
>>> that require memory allocation.
>>>
>>> However commit d30f83d278a9 ("interconnect: core: Add dynamic
>>> id allocation support") added a devm_kasprintf() call into a
>>> path protected by the 'icc_bw_lock' which causes this lockdep
>>> warning (at least on the IPQ9574 platform):
>>
>> Missing a Fixes tag.
>
> Erm, it is before my s-o-b tag.
Great thank you I see that.
> ...
>
>>> Move the memory allocation part of the code outside of the protected
>>> path to eliminate the warning. Also add a note about why it is moved
>>> to there,
>>>
>>> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
>>> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
>>> ---
>>> drivers/interconnect/core.c | 14 ++++++++++----
>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
>>> index
>>> 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644
>>> --- a/drivers/interconnect/core.c
>>> +++ b/drivers/interconnect/core.c
>>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct
>>> icc_provider *provider)
>>> return;
>>>
>>> mutex_lock(&icc_lock);
>>> +
>>> + if (node->id >= ICC_DYN_ID_START) {
>>> + /*
>>> + * Memory allocation must be done outside of codepaths
>>> + * protected by icc_bw_lock.
>>> + */
>>> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>>> + node->name, dev_name(provider->dev));
>>> + }
>>> +
>>> mutex_lock(&icc_bw_lock);
>>>
>>> node->provider = provider;
>>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct
>>> icc_provider *provider)
>>> node->avg_bw = node->init_avg;
>>> node->peak_bw = node->init_peak;
>>>
>>> - if (node->id >= ICC_DYN_ID_START)
>>> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>>> - node->name, dev_name(provider->dev));
>>> -
>>> if (node->avg_bw || node->peak_bw) {
>>> if (provider->pre_aggregate)
>>> provider->pre_aggregate(node);
>>>
>>> ---
>>> base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2
>>> change-id: 20250529-icc-bw-lockdep-ed030d892a19
>>>
>>> Best regards,
>>> --
>>> Gabor Juhos <j4g8y7@gmail.com>
>>>
>>>
>>
>> The locking in this code is a mess.
>>
>> Which data-structures does icc_lock protect node* pointers I think and which
>> data-structures does icc_bw_lock protect - "bw" data structures ?
>>
>> Hmm.
>>
>> Looking at this code I'm not sure at all what icc_lock was introduced to do.
>
> Initially, only the 'icc_lock' mutex was here, and that protected 'everything'.
> The 'icc_bw_lock' has been introduced later by commit af42269c3523
> ("interconnect: Fix locking for runpm vs reclaim") as part of the
> "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1].
>
> Here is the reason copied from the original commit message:
>
> "For cases where icc_bw_set() can be called in callbaths that could
> deadlock against shrinker/reclaim, such as runpm resume, we need to
> decouple the icc locking. Introduce a new icc_bw_lock for cases where
> we need to serialize bw aggregation and update to decouple that from
> paths that require memory allocation such as node/link creation/
> destruction."
Right but reading this code.
icc_set_bw();
icc_lock_bw - protects struct icc_node *
icc_put();
icc_lock - locks
icc_lock_bw -locks directly after protects struct icc_node *
icc_node_add current:
icc_lock - locks
icc_lock_bw - locks
node->name = devm_kasprintf();
After your change
icc_node_add current:
icc_lock - locks
node->name = devm_kasprintf();
icc_lock_bw - locks
owns node->provider - or whatever
And this is what is prompting my question. Which locks own which data here ?
I think we should sort that out, either by removing one of the locks or
by at the very least documenting beside the mutex declarations which
locks protect what.
---
bod
>> Can we not just drop it entirely ?
>
> I'm not an expert in locking, but I doubt that we can easily drop any of the two
> mutexes without reintroducing the problem fixed by the change mentioned above.
>
> [1] https://lore.kernel.org/all/20230807171148.210181-1-robdclark@gmail.com/
>
> Regards,
> Gabor
>
>
Right - if this were a struct we would declare what these individual
mutexes lock.
That's my question here, as I review this code, which mutex protects what ?
I don't think that is particularly clear.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
2025-05-29 14:46 ` [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held Gabor Juhos
2025-05-30 9:16 ` Bryan O'Donoghue
@ 2025-06-18 12:43 ` Johan Hovold
2025-06-18 19:38 ` Gabor Juhos
1 sibling, 1 reply; 8+ messages in thread
From: Johan Hovold @ 2025-06-18 12:43 UTC (permalink / raw)
To: Gabor Juhos, Georgi Djakov
Cc: Raviteja Laggyshetty, Bryan O'Donoghue, linux-pm,
linux-arm-msm, linux-kernel
On Thu, May 29, 2025 at 04:46:22PM +0200, Gabor Juhos wrote:
> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
> ("interconnect: Fix locking for runpm vs reclaim") in order
> to decouple serialization of bw aggregation from codepaths
> that require memory allocation.
>
> However commit d30f83d278a9 ("interconnect: core: Add dynamic
> id allocation support") added a devm_kasprintf() call into a
> path protected by the 'icc_bw_lock' which causes this lockdep
> warning (at least on the IPQ9574 platform):
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.15.0-next-20250529 #0 Not tainted
> ------------------------------------------------------
> swapper/0/1 is trying to acquire lock:
> ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108
>
> but task is already holding lock:
> ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
>
> which lock already depends on the new lock.
Thanks for fixing this. I get a similar splat with sc8280xp and the
icc_ism_l3 driver since 6.16-rc1.
Georgi, this is a regression that prevents lockdep from being used on a
bunch of Qualcomm platforms and should be fixed in mainline ASAP (e.g.
to avoid further locking issues from being introduced).
> Move the memory allocation part of the code outside of the protected
> path to eliminate the warning. Also add a note about why it is moved
> to there,
>
> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
> ---
> drivers/interconnect/core.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644
> --- a/drivers/interconnect/core.c
> +++ b/drivers/interconnect/core.c
> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
> return;
>
> mutex_lock(&icc_lock);
> +
> + if (node->id >= ICC_DYN_ID_START) {
> + /*
> + * Memory allocation must be done outside of codepaths
> + * protected by icc_bw_lock.
> + */
> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> + node->name, dev_name(provider->dev));
> + }
The node name has already been set by the caller and the node has not
been added yet, so I think you should move this before taking the
icc_lock.
> +
> mutex_lock(&icc_bw_lock);
>
> node->provider = provider;
With that addressed, feel free to add my:
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Johan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
2025-06-03 10:01 ` Bryan O'Donoghue
@ 2025-06-18 12:50 ` Johan Hovold
2025-06-18 13:28 ` Bryan O'Donoghue
0 siblings, 1 reply; 8+ messages in thread
From: Johan Hovold @ 2025-06-18 12:50 UTC (permalink / raw)
To: Bryan O'Donoghue, Rob Clark
Cc: Gabor Juhos, Bryan O'Donoghue, Georgi Djakov,
Raviteja Laggyshetty, linux-pm, linux-arm-msm, linux-kernel
[ +CC: Rob ]
On Tue, Jun 03, 2025 at 10:01:31AM +0000, Bryan O'Donoghue wrote:
> On 03/06/2025 10:15, Gabor Juhos wrote:
> > 2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta:
> >> On 29/05/2025 15:46, Gabor Juhos wrote:
> >>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
> >>> ("interconnect: Fix locking for runpm vs reclaim") in order
> >>> to decouple serialization of bw aggregation from codepaths
> >>> that require memory allocation.
> >>>
> >>> However commit d30f83d278a9 ("interconnect: core: Add dynamic
> >>> id allocation support") added a devm_kasprintf() call into a
> >>> path protected by the 'icc_bw_lock' which causes this lockdep
> >>> warning (at least on the IPQ9574 platform):
> >>> Move the memory allocation part of the code outside of the protected
> >>> path to eliminate the warning. Also add a note about why it is moved
> >>> to there,
> >>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct
> >>> icc_provider *provider)
> >>> return;
> >>>
> >>> mutex_lock(&icc_lock);
> >>> +
> >>> + if (node->id >= ICC_DYN_ID_START) {
> >>> + /*
> >>> + * Memory allocation must be done outside of codepaths
> >>> + * protected by icc_bw_lock.
> >>> + */
> >>> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> >>> + node->name, dev_name(provider->dev));
> >>> + }
> >>> +
> >>> mutex_lock(&icc_bw_lock);
> >>>
> >>> node->provider = provider;
> >>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct
> >>> icc_provider *provider)
> >>> node->avg_bw = node->init_avg;
> >>> node->peak_bw = node->init_peak;
> >>>
> >>> - if (node->id >= ICC_DYN_ID_START)
> >>> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> >>> - node->name, dev_name(provider->dev));
> >>> -
> >>> if (node->avg_bw || node->peak_bw) {
> >>> if (provider->pre_aggregate)
> >>> provider->pre_aggregate(node);
> >> The locking in this code is a mess.
> >>
> >> Which data-structures does icc_lock protect node* pointers I think and which
> >> data-structures does icc_bw_lock protect - "bw" data structures ?
> >>
> >> Hmm.
> >>
> >> Looking at this code I'm not sure at all what icc_lock was introduced to do.
> >
> > Initially, only the 'icc_lock' mutex was here, and that protected 'everything'.
> > The 'icc_bw_lock' has been introduced later by commit af42269c3523
> > ("interconnect: Fix locking for runpm vs reclaim") as part of the
> > "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1].
> >
> > Here is the reason copied from the original commit message:
> >
> > "For cases where icc_bw_set() can be called in callbaths that could
> > deadlock against shrinker/reclaim, such as runpm resume, we need to
> > decouple the icc locking. Introduce a new icc_bw_lock for cases where
> > we need to serialize bw aggregation and update to decouple that from
> > paths that require memory allocation such as node/link creation/
> > destruction."
>
> Right but reading this code.
>
> icc_set_bw();
> icc_lock_bw - protects struct icc_node *
>
> icc_put();
> icc_lock - locks
> icc_lock_bw -locks directly after protects struct icc_node *
>
> icc_node_add current:
> icc_lock - locks
> icc_lock_bw - locks
> node->name = devm_kasprintf();
>
> After your change
>
> icc_node_add current:
> icc_lock - locks
> node->name = devm_kasprintf();
> icc_lock_bw - locks
> owns node->provider - or whatever
>
> And this is what is prompting my question. Which locks own which data here ?
>
> I think we should sort that out, either by removing one of the locks or
> by at the very least documenting beside the mutex declarations which
> locks protect what.
Feel free to discuss that with Rob who added the icc_lock_bw, but it's
unrelated to the regression at hand (and should not block fixing it).
Allocations cannot be done while holding the icc_lock_bw, and this fix
is correct in moving the allocation (also note that the node has not
been added yet).
Johan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
2025-06-18 12:50 ` Johan Hovold
@ 2025-06-18 13:28 ` Bryan O'Donoghue
0 siblings, 0 replies; 8+ messages in thread
From: Bryan O'Donoghue @ 2025-06-18 13:28 UTC (permalink / raw)
To: Johan Hovold, Bryan O'Donoghue, Rob Clark
Cc: Gabor Juhos, Georgi Djakov, Raviteja Laggyshetty, linux-pm,
linux-arm-msm, linux-kernel
On 18/06/2025 13:50, Johan Hovold wrote:
>> I think we should sort that out, either by removing one of the locks or
>> by at the very least documenting beside the mutex declarations which
>> locks protect what.
> Feel free to discuss that with Rob who added the icc_lock_bw, but it's
> unrelated to the regression at hand (and should not block fixing it).
True.
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
2025-06-18 12:43 ` Johan Hovold
@ 2025-06-18 19:38 ` Gabor Juhos
0 siblings, 0 replies; 8+ messages in thread
From: Gabor Juhos @ 2025-06-18 19:38 UTC (permalink / raw)
To: Johan Hovold, Georgi Djakov
Cc: Raviteja Laggyshetty, Bryan O'Donoghue, linux-pm,
linux-arm-msm, linux-kernel
Hi Johan,
Thank you for taking a look into this.
2025. 06. 18. 14:43 keltezéssel, Johan Hovold írta:
> On Thu, May 29, 2025 at 04:46:22PM +0200, Gabor Juhos wrote:
>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
>> ("interconnect: Fix locking for runpm vs reclaim") in order
>> to decouple serialization of bw aggregation from codepaths
>> that require memory allocation.
>>
>> However commit d30f83d278a9 ("interconnect: core: Add dynamic
>> id allocation support") added a devm_kasprintf() call into a
>> path protected by the 'icc_bw_lock' which causes this lockdep
>> warning (at least on the IPQ9574 platform):
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 6.15.0-next-20250529 #0 Not tainted
>> ------------------------------------------------------
>> swapper/0/1 is trying to acquire lock:
>> ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108
>>
>> but task is already holding lock:
>> ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
>>
>> which lock already depends on the new lock.
>
> Thanks for fixing this. I get a similar splat with sc8280xp and the
> icc_ism_l3 driver since 6.16-rc1.
>
> Georgi, this is a regression that prevents lockdep from being used on a
> bunch of Qualcomm platforms and should be fixed in mainline ASAP (e.g.
> to avoid further locking issues from being introduced).
>
>> Move the memory allocation part of the code outside of the protected
>> path to eliminate the warning. Also add a note about why it is moved
>> to there,
>>
>> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
>> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
>> ---
>> drivers/interconnect/core.c | 14 ++++++++++----
>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
>> index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644
>> --- a/drivers/interconnect/core.c
>> +++ b/drivers/interconnect/core.c
>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
>> return;
>>
>> mutex_lock(&icc_lock);
>> +
>> + if (node->id >= ICC_DYN_ID_START) {
>> + /*
>> + * Memory allocation must be done outside of codepaths
>> + * protected by icc_bw_lock.
>> + */
>> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>> + node->name, dev_name(provider->dev));
>> + }
>
> The node name has already been set by the caller and the node has not
> been added yet, so I think you should move this before taking the
> icc_lock.
It seems reasonable. I will send a modified version, which also contains
handling of memory allocation failures.
>
>> +
>> mutex_lock(&icc_bw_lock);
>>
>> node->provider = provider;
>
> With that addressed, feel free to add my:
>
> Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
> Tested-by: Johan Hovold <johan+linaro@kernel.org>
Regards,
Gabor
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-06-18 19:38 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <TIkPOGVjPeCjPzjVtlSb6V5CIcpaXf2-6WG6HjAyaOW59Hj01-9VK7Z8DKadakOKr6fJeQICi6h0Z8mft9DQyg==@protonmail.internalid>
2025-05-29 14:46 ` [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held Gabor Juhos
2025-05-30 9:16 ` Bryan O'Donoghue
2025-06-03 9:15 ` Gabor Juhos
2025-06-03 10:01 ` Bryan O'Donoghue
2025-06-18 12:50 ` Johan Hovold
2025-06-18 13:28 ` Bryan O'Donoghue
2025-06-18 12:43 ` Johan Hovold
2025-06-18 19:38 ` Gabor Juhos
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).