linux-arm-msm.vger.kernel.org archive mirror
* [PATCH v2] interconnect: avoid memory allocation when 'icc_bw_lock' is held
@ 2025-06-18 19:58 Gabor Juhos
  2025-06-19 10:07 ` Johan Hovold
  0 siblings, 1 reply; 5+ messages in thread
From: Gabor Juhos @ 2025-06-18 19:58 UTC
  To: Georgi Djakov, Raviteja Laggyshetty, Johan Hovold,
	Bryan O'Donoghue
  Cc: linux-pm, linux-arm-msm, linux-kernel, Gabor Juhos

The 'icc_bw_lock' mutex was introduced in commit af42269c3523
("interconnect: Fix locking for runpm vs reclaim") in order
to decouple serialization of bw aggregation from codepaths
that require memory allocation.

However, commit d30f83d278a9 ("interconnect: core: Add dynamic
id allocation support") added a devm_kasprintf() call to a
path protected by 'icc_bw_lock', which causes this lockdep
warning (at least on the IPQ9574 platform):

    ======================================================
    WARNING: possible circular locking dependency detected
    6.15.0-next-20250529 #0 Not tainted
    ------------------------------------------------------
    swapper/0/1 is trying to acquire lock:
    ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108

    but task is already holding lock:
    ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (fs_reclaim){+.+.}-{0:0}:
           fs_reclaim_acquire+0x7c/0xb8
           slab_alloc_node.isra.0+0x48/0x188
           __kmalloc_node_track_caller_noprof+0xa4/0x2b8
           devm_kmalloc+0x5c/0x138
           devm_kvasprintf+0x6c/0xb8
           devm_kasprintf+0x50/0x68
           icc_node_add+0xbc/0x160
           icc_clk_register+0x15c/0x230
           devm_icc_clk_register+0x20/0x90
           qcom_cc_really_probe+0x320/0x338
           nss_cc_ipq9574_probe+0xac/0x1e8
           platform_probe+0x70/0xd0
           really_probe+0xdc/0x3b8
           __driver_probe_device+0x94/0x178
           driver_probe_device+0x48/0xf0
           __driver_attach+0x13c/0x208
           bus_for_each_dev+0x6c/0xb8
           driver_attach+0x2c/0x40
           bus_add_driver+0x100/0x250
           driver_register+0x68/0x138
           __platform_driver_register+0x2c/0x40
           nss_cc_ipq9574_driver_init+0x24/0x38
           do_one_initcall+0x88/0x340
           kernel_init_freeable+0x2ac/0x4f8
           kernel_init+0x28/0x1e8
           ret_from_fork+0x10/0x20

    -> #0 (icc_bw_lock){+.+.}-{4:4}:
           __lock_acquire+0x1348/0x2090
           lock_acquire+0x108/0x2d8
           icc_init+0x50/0x108
           do_one_initcall+0x88/0x340
           kernel_init_freeable+0x2ac/0x4f8
           kernel_init+0x28/0x1e8
           ret_from_fork+0x10/0x20

    other info that might help us debug this:

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(fs_reclaim);
                                   lock(icc_bw_lock);
                                   lock(fs_reclaim);
      lock(icc_bw_lock);

     *** DEADLOCK ***

    1 lock held by swapper/0/1:
     #0: ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108

    stack backtrace:
    CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-next-20250529 #0 NONE
    Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
    Call trace:
     show_stack+0x20/0x38 (C)
     dump_stack_lvl+0x90/0xd0
     dump_stack+0x18/0x28
     print_circular_bug+0x334/0x448
     check_noncircular+0x12c/0x140
     __lock_acquire+0x1348/0x2090
     lock_acquire+0x108/0x2d8
     icc_init+0x50/0x108
     do_one_initcall+0x88/0x340
     kernel_init_freeable+0x2ac/0x4f8
     kernel_init+0x28/0x1e8
     ret_from_fork+0x10/0x20

Move the memory allocation out of the protected path to eliminate
the warning, and add a comment explaining why it was moved. Also
add handling for memory allocation failure while we are at it.
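
For illustration only (this is not part of the patch), the ordering
described above can be sketched in userspace C. A pthread mutex stands
in for 'icc_bw_lock' and malloc() stands in for devm_kasprintf(); every
sketch_* name is hypothetical. The point is only that the allocation
finishes before the lock is taken, so the allocator can never be
entered while the lock is held.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for icc_bw_lock (assumption: a plain userspace mutex). */
static pthread_mutex_t sketch_bw_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for node->name; only written under sketch_bw_lock. */
static const char *sketch_node_name;

/* Build "<base>@<dev>" first, then publish it under the lock. */
static int sketch_node_add(const char *base, const char *dev)
{
	size_t len;
	char *name;

	assert(base && dev);

	/* Allocation happens here, with no lock held. */
	len = strlen(base) + 1 + strlen(dev) + 1;
	name = malloc(len);
	if (!name)
		return -ENOMEM;
	snprintf(name, len, "%s@%s", base, dev);

	/* The locked section only stores a pointer; it cannot allocate. */
	pthread_mutex_lock(&sketch_bw_lock);
	sketch_node_name = name;
	pthread_mutex_unlock(&sketch_bw_lock);

	return 0;
}
```

Calling sketch_node_add("node", "dev") leaves sketch_node_name pointing
at "node@dev".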

Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
---
Changes in v2:
  - move memory allocation outside of icc_lock
  - issue a warning and return without modifying the node name in case of
    memory allocation failure, and adjust the commit description
  - remove offered tags from Johan and Bryan
    Note: since I was not sure whether the added WARN_ON() is a
    substantial change, I have intentionally removed the offered tags
    to be on the safe side
---
 drivers/interconnect/core.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 1a41e59c77f85a811f78986e98401625f4cadfa3..32d969c349093bc356dc66234c62484aa9b9e872 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -1022,6 +1022,21 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
 	if (WARN_ON(node->provider))
 		return;
 
+	if (node->id >= ICC_DYN_ID_START) {
+		char *name;
+
+		/*
+		 * Memory allocation must be done outside of codepaths
+		 * protected by icc_bw_lock.
+		 */
+		name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
+				      node->name, dev_name(provider->dev));
+		if (WARN_ON(!name))
+			return;
+
+		node->name = name;
+	}
+
 	mutex_lock(&icc_lock);
 	mutex_lock(&icc_bw_lock);
 
@@ -1038,10 +1053,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
 	node->avg_bw = node->init_avg;
 	node->peak_bw = node->init_peak;
 
-	if (node->id >= ICC_DYN_ID_START)
-		node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
-					    node->name, dev_name(provider->dev));
-
 	if (node->avg_bw || node->peak_bw) {
 		if (provider->pre_aggregate)
 			provider->pre_aggregate(node);

---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250529-icc-bw-lockdep-ed030d892a19

Best regards,
-- 
Gabor Juhos <j4g8y7@gmail.com>



* Re: [PATCH v2] interconnect: avoid memory allocation when 'icc_bw_lock' is held
  2025-06-18 19:58 [PATCH v2] interconnect: avoid memory allocation when 'icc_bw_lock' is held Gabor Juhos
@ 2025-06-19 10:07 ` Johan Hovold
  2025-06-19 13:03   ` Gabor Juhos
  0 siblings, 1 reply; 5+ messages in thread
From: Johan Hovold @ 2025-06-19 10:07 UTC
  To: Gabor Juhos
  Cc: Georgi Djakov, Raviteja Laggyshetty, Johan Hovold,
	Bryan O'Donoghue, linux-pm, linux-arm-msm, linux-kernel

On Wed, Jun 18, 2025 at 09:58:31PM +0200, Gabor Juhos wrote:
> The 'icc_bw_lock' mutex was introduced in commit af42269c3523
> ("interconnect: Fix locking for runpm vs reclaim") in order
> to decouple serialization of bw aggregation from codepaths
> that require memory allocation.
> 
> However, commit d30f83d278a9 ("interconnect: core: Add dynamic
> id allocation support") added a devm_kasprintf() call to a
> path protected by 'icc_bw_lock', which causes this lockdep
> warning (at least on the IPQ9574 platform):
> 
>     ======================================================
>     WARNING: possible circular locking dependency detected
>     6.15.0-next-20250529 #0 Not tainted

> Move the memory allocation out of the protected path to eliminate
> the warning, and add a comment explaining why it was moved. Also
> add handling for memory allocation failure while we are at it.
> 
> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
> ---
> Changes in v2:
>   - move memory allocation outside of icc_lock
>   - issue a warning and return without modifying the node name in case of
>     memory allocation failure, and adjust the commit description
>   - remove offered tags from Johan and Bryan
>     Note: since I was not sure whether the added WARN_ON() is a
>     substantial change, I have intentionally removed the offered tags
>     to be on the safe side

Bah, what a mess (thanks for dropping the tags).

This dynamic id feature looks like a very ad-hoc and badly designed
interface.

icc_node_add() should not be allocating memory in the first place as it
is not designed to ever fail (e.g. does not return errors).

Generating the name could have been done as part of
icc_node_create_dyn() or yet another helper for the caller could have
been added for that. In any case, it should be done before calling
icc_node_add().

Perhaps the best minimal fix of the regression is to move the allocation
into the two users of this interface. They already handle both dynamic
and non-dynamic node allocation explicitly.

Then whoever cares about this code can come up with a common interface
for allocating the name (e.g. move it into icc_node_create_dyn() or add
a new icc_node_init() helper or similar).
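
To make the suggestion concrete, here is a rough userspace sketch of
what such a helper could look like. It is not taken from the kernel
tree: sketch_node, SKETCH_DYN_ID_START and sketch_node_set_name() are
hypothetical stand-ins, and malloc() replaces devm_kasprintf(). The
helper runs before the node is added and, unlike icc_node_add(), can
report failure to its caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fields of struct icc_node used here. */
struct sketch_node {
	unsigned int id;
	const char *name;
};

/* Hypothetical threshold playing the role of ICC_DYN_ID_START. */
#define SKETCH_DYN_ID_START 100000

/*
 * Give a dynamically numbered node the name "<name>@<dev>".
 * Returns 0 on success or a negative errno; on failure the node is
 * left unchanged, so the caller can bail out before adding it.
 */
static int sketch_node_set_name(struct sketch_node *node, const char *dev_name)
{
	char *name;
	int len;

	assert(node && node->name && dev_name);

	/* Nodes with a static id keep the name they already have. */
	if (node->id < SKETCH_DYN_ID_START)
		return 0;

	/* First pass computes the length, second pass formats. */
	len = snprintf(NULL, 0, "%s@%s", node->name, dev_name);
	if (len < 0)
		return -EINVAL;

	name = malloc((size_t)len + 1);
	if (!name)
		return -ENOMEM;

	snprintf(name, (size_t)len + 1, "%s@%s", node->name, dev_name);
	node->name = name;

	return 0;
}
```

A caller would check the return value and only go on to add the node
when it is 0, which keeps the add step itself free of allocations and
of failure paths.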

> ---
>  drivers/interconnect/core.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> index 1a41e59c77f85a811f78986e98401625f4cadfa3..32d969c349093bc356dc66234c62484aa9b9e872 100644
> --- a/drivers/interconnect/core.c
> +++ b/drivers/interconnect/core.c
> @@ -1022,6 +1022,21 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
>  	if (WARN_ON(node->provider))
>  		return;
>  
> +	if (node->id >= ICC_DYN_ID_START) {
> +		char *name;
> +
> +		/*
> +		 * Memory allocation must be done outside of codepaths
> +		 * protected by icc_bw_lock.
> +		 */
> +		name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> +				      node->name, dev_name(provider->dev));
> +		if (WARN_ON(!name))
> +			return;

But this won't do. We'd need to return an error to the caller (even if
this small allocation will never fail in practice).

> +
> +		node->name = name;
> +	}

Johan


* Re: [PATCH v2] interconnect: avoid memory allocation when 'icc_bw_lock' is held
  2025-06-19 10:07 ` Johan Hovold
@ 2025-06-19 13:03   ` Gabor Juhos
  2025-06-23  8:58     ` Johan Hovold
  0 siblings, 1 reply; 5+ messages in thread
From: Gabor Juhos @ 2025-06-19 13:03 UTC
  To: Johan Hovold
  Cc: Georgi Djakov, Raviteja Laggyshetty, Johan Hovold,
	Bryan O'Donoghue, linux-pm, linux-arm-msm, linux-kernel

On 2025-06-19 at 12:07, Johan Hovold wrote:
> On Wed, Jun 18, 2025 at 09:58:31PM +0200, Gabor Juhos wrote:
> 
> Bah, what a mess (thanks for dropping the tags).
> 
> This dynamic id feature looks like a very ad-hoc and badly designed
> interface.
> 
> icc_node_add() should not be allocating memory in the first place as it
> is not designed to ever fail (e.g. does not return errors).
> 
> Generating the name could have been done as part of
> icc_node_create_dyn() or yet another helper for the caller could have
> been added for that. In any case, it should be done before calling
> icc_node_add().
> 
> Perhaps the best minimal fix of the regression is to move the allocation
> into the two users of this interface. They already handle both dynamic
> and non-dynamic node allocation explicitly.

OK, I will change the patch. Just to be clear, you mean the
qcom_icc_rpmh_probe() and qcom_osm_l3_probe() functions, right?

> 
> Then whoever cares about this code can come up with a common interface
> for allocating the name (e.g. move it into icc_node_create_dyn() or add
> a new icc_node_init() helper or similar).
> 
>> ---
>>  drivers/interconnect/core.c | 19 +++++++++++++++----
>>  1 file changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
>> index 1a41e59c77f85a811f78986e98401625f4cadfa3..32d969c349093bc356dc66234c62484aa9b9e872 100644
>> --- a/drivers/interconnect/core.c
>> +++ b/drivers/interconnect/core.c
>> @@ -1022,6 +1022,21 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
>>  	if (WARN_ON(node->provider))
>>  		return;
>>  
>> +	if (node->id >= ICC_DYN_ID_START) {
>> +		char *name;
>> +
>> +		/*
>> +		 * Memory allocation must be done outside of codepaths
>> +		 * protected by icc_bw_lock.
>> +		 */
>> +		name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>> +				      node->name, dev_name(provider->dev));
>> +		if (WARN_ON(!name))
>> +			return;
> 
> But this won't do. We'd need to return an error to the caller (even if
> this small allocation will never fail in practice).

I admit that it is ugly, but I thought that an explicit warning is better than a
hidden null pointer dereference.

Regards,
Gabor


* Re: [PATCH v2] interconnect: avoid memory allocation when 'icc_bw_lock' is held
  2025-06-19 13:03   ` Gabor Juhos
@ 2025-06-23  8:58     ` Johan Hovold
  2025-06-23 15:21       ` Gabor Juhos
  0 siblings, 1 reply; 5+ messages in thread
From: Johan Hovold @ 2025-06-23  8:58 UTC
  To: Gabor Juhos
  Cc: Georgi Djakov, Raviteja Laggyshetty, Johan Hovold,
	Bryan O'Donoghue, linux-pm, linux-arm-msm, linux-kernel,
	Bjorn Andersson

[ +CC: Bjorn ]

On Thu, Jun 19, 2025 at 03:03:50PM +0200, Gabor Juhos wrote:
> On 2025-06-19 at 12:07, Johan Hovold wrote:
> > On Wed, Jun 18, 2025 at 09:58:31PM +0200, Gabor Juhos wrote:
> > 
> > Bah, what a mess (thanks for dropping the tags).
> > 
> > This dynamic id feature looks like a very ad-hoc and badly designed
> > interface.
> > 
> > icc_node_add() should not be allocating memory in the first place as it
> > is not designed to ever fail (e.g. does not return errors).
> > 
> > Generating the name could have been done as part of
> > icc_node_create_dyn() or yet another helper for the caller could have
> > been added for that. In any case, it should be done before calling
> > icc_node_add().
> > 
> > Perhaps the best minimal fix of the regression is to move the allocation
> > into the two users of this interface. They already handle both dynamic
> > and non-dynamic node allocation explicitly.
> 
> OK, I will change the patch. Just to be clear, you mean the
> qcom_icc_rpmh_probe() and qcom_osm_l3_probe() functions, right?

Yes, indeed.

Apparently this is how it was done in the first six iterations of the
series adding this, and then the author was asked to generalise the name
generation. That can still be done as a follow-up (by the Qualcomm
folks) after fixing the immediate issues:

	https://lore.kernel.org/all/lm6gvcrnd2pcphex4pugxie7m47qlvrgvsvuf75w4uumwoouew@qcuvxeb3u72s

> > Then whoever cares about this code can come up with a common interface
> > for allocating the name (e.g. move it into icc_node_create_dyn() or add
> > a new icc_node_init() helper or similar).

Johan


* Re: [PATCH v2] interconnect: avoid memory allocation when 'icc_bw_lock' is held
  2025-06-23  8:58     ` Johan Hovold
@ 2025-06-23 15:21       ` Gabor Juhos
  0 siblings, 0 replies; 5+ messages in thread
From: Gabor Juhos @ 2025-06-23 15:21 UTC
  To: Johan Hovold
  Cc: Georgi Djakov, Raviteja Laggyshetty, Johan Hovold,
	Bryan O'Donoghue, linux-pm, linux-arm-msm, linux-kernel,
	Bjorn Andersson

On 2025-06-23 at 10:58, Johan Hovold wrote:
> [ +CC: Bjorn ]
> 
> On Thu, Jun 19, 2025 at 03:03:50PM +0200, Gabor Juhos wrote:
>> On 2025-06-19 at 12:07, Johan Hovold wrote:
>>> On Wed, Jun 18, 2025 at 09:58:31PM +0200, Gabor Juhos wrote:
>>>
>>> Bah, what a mess (thanks for dropping the tags).
>>>
>>> This dynamic id feature looks like a very ad-hoc and badly designed
>>> interface.
>>>
>>> icc_node_add() should not be allocating memory in the first place as it
>>> is not designed to ever fail (e.g. does not return errors).
>>>
>>> Generating the name could have been done as part of
>>> icc_node_create_dyn() or yet another helper for the caller could have
>>> been added for that. In any case, it should be done before calling
>>> icc_node_add().
>>>
>>> Perhaps the best minimal fix of the regression is to move the allocation
>>> into the two users of this interface. They already handle both dynamic
>>> and non-dynamic node allocation explicitly.
>>
>> OK, I will change the patch. Just to be clear, you mean the
>> qcom_icc_rpmh_probe() and qcom_osm_l3_probe() functions, right?
> 
> Yes, indeed.

Ok.

> 
> Apparently this is how it was done in the first six iterations of the
> series adding this, and then the author was asked to generalise the name
> generation. That can still be done as a follow-up (by the Qualcomm
> folks) after fixing the immediate issues:
> 
> 	https://lore.kernel.org/all/lm6gvcrnd2pcphex4pugxie7m47qlvrgvsvuf75w4uumwoouew@qcuvxeb3u72s


Thanks for digging this out; I had only checked the last two iterations.

>>> Then whoever cares about this code can come up with a common interface
>>> for allocating the name (e.g. move it into icc_node_create_dyn() or add
>>> a new icc_node_init() helper or similar).
> 

Regards,
Gabor
