* [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held @ 2025-05-29 14:46 ` Gabor Juhos 2025-05-30 9:16 ` Bryan O'Donoghue 2025-06-18 12:43 ` Johan Hovold 0 siblings, 2 replies; 8+ messages in thread From: Gabor Juhos @ 2025-05-29 14:46 UTC (permalink / raw) To: Georgi Djakov, Raviteja Laggyshetty Cc: linux-pm, linux-arm-msm, linux-kernel, Gabor Juhos The 'icc_bw_lock' mutex is introduced in commit af42269c3523 ("interconnect: Fix locking for runpm vs reclaim") in order to decouple serialization of bw aggregation from codepaths that require memory allocation. However commit d30f83d278a9 ("interconnect: core: Add dynamic id allocation support") added a devm_kasprintf() call into a path protected by the 'icc_bw_lock' which causes this lockdep warning (at least on the IPQ9574 platform): ====================================================== WARNING: possible circular locking dependency detected 6.15.0-next-20250529 #0 Not tainted ------------------------------------------------------ swapper/0/1 is trying to acquire lock: ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108 but task is already holding lock: ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire+0x7c/0xb8 slab_alloc_node.isra.0+0x48/0x188 __kmalloc_node_track_caller_noprof+0xa4/0x2b8 devm_kmalloc+0x5c/0x138 devm_kvasprintf+0x6c/0xb8 devm_kasprintf+0x50/0x68 icc_node_add+0xbc/0x160 icc_clk_register+0x15c/0x230 devm_icc_clk_register+0x20/0x90 qcom_cc_really_probe+0x320/0x338 nss_cc_ipq9574_probe+0xac/0x1e8 platform_probe+0x70/0xd0 really_probe+0xdc/0x3b8 __driver_probe_device+0x94/0x178 driver_probe_device+0x48/0xf0 __driver_attach+0x13c/0x208 bus_for_each_dev+0x6c/0xb8 driver_attach+0x2c/0x40 bus_add_driver+0x100/0x250 driver_register+0x68/0x138 __platform_driver_register+0x2c/0x40 nss_cc_ipq9574_driver_init+0x24/0x38 do_one_initcall+0x88/0x340 kernel_init_freeable+0x2ac/0x4f8 kernel_init+0x28/0x1e8 ret_from_fork+0x10/0x20 -> #0 (icc_bw_lock){+.+.}-{4:4}: __lock_acquire+0x1348/0x2090 lock_acquire+0x108/0x2d8 icc_init+0x50/0x108 do_one_initcall+0x88/0x340 kernel_init_freeable+0x2ac/0x4f8 kernel_init+0x28/0x1e8 ret_from_fork+0x10/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(icc_bw_lock); lock(fs_reclaim); lock(icc_bw_lock); *** DEADLOCK *** 1 lock held by swapper/0/1: #0: ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108 stack backtrace: CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-next-20250529 #0 NONE Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT) Call trace: show_stack+0x20/0x38 (C) dump_stack_lvl+0x90/0xd0 dump_stack+0x18/0x28 print_circular_bug+0x334/0x448 check_noncircular+0x12c/0x140 __lock_acquire+0x1348/0x2090 lock_acquire+0x108/0x2d8 icc_init+0x50/0x108 do_one_initcall+0x88/0x340 kernel_init_freeable+0x2ac/0x4f8 kernel_init+0x28/0x1e8 ret_from_fork+0x10/0x20 Move the memory allocation part of the code outside of the protected path to eliminate the warning. Also add a note about why it is moved to there, Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support") Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> --- drivers/interconnect/core.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644 --- a/drivers/interconnect/core.c +++ b/drivers/interconnect/core.c @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) return; mutex_lock(&icc_lock); + + if (node->id >= ICC_DYN_ID_START) { + /* + * Memory allocation must be done outside of codepaths + * protected by icc_bw_lock. + */ + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", + node->name, dev_name(provider->dev)); + } + mutex_lock(&icc_bw_lock); node->provider = provider; @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) node->avg_bw = node->init_avg; node->peak_bw = node->init_peak; - if (node->id >= ICC_DYN_ID_START) - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", - node->name, dev_name(provider->dev)); - if (node->avg_bw || node->peak_bw) { if (provider->pre_aggregate) provider->pre_aggregate(node); --- base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2 change-id: 20250529-icc-bw-lockdep-ed030d892a19 Best regards, -- Gabor Juhos <j4g8y7@gmail.com> ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held 2025-05-29 14:46 ` [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held Gabor Juhos @ 2025-05-30 9:16 ` Bryan O'Donoghue 2025-06-03 9:15 ` Gabor Juhos 2025-06-18 12:43 ` Johan Hovold 1 sibling, 1 reply; 8+ messages in thread From: Bryan O'Donoghue @ 2025-05-30 9:16 UTC (permalink / raw) To: Gabor Juhos, Georgi Djakov, Raviteja Laggyshetty Cc: linux-pm, linux-arm-msm, linux-kernel On 29/05/2025 15:46, Gabor Juhos wrote: > The 'icc_bw_lock' mutex is introduced in commit af42269c3523 > ("interconnect: Fix locking for runpm vs reclaim") in order > to decouple serialization of bw aggregation from codepaths > that require memory allocation. > > However commit d30f83d278a9 ("interconnect: core: Add dynamic > id allocation support") added a devm_kasprintf() call into a > path protected by the 'icc_bw_lock' which causes this lockdep > warning (at least on the IPQ9574 platform): Missing a Fixes tag. > > ====================================================== > WARNING: possible circular locking dependency detected > 6.15.0-next-20250529 #0 Not tainted > ------------------------------------------------------ > swapper/0/1 is trying to acquire lock: > ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108 > > but task is already holding lock: > ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108 > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > > -> #1 (fs_reclaim){+.+.}-{0:0}: > fs_reclaim_acquire+0x7c/0xb8 > slab_alloc_node.isra.0+0x48/0x188 > __kmalloc_node_track_caller_noprof+0xa4/0x2b8 > devm_kmalloc+0x5c/0x138 > devm_kvasprintf+0x6c/0xb8 > devm_kasprintf+0x50/0x68 > icc_node_add+0xbc/0x160 > icc_clk_register+0x15c/0x230 > devm_icc_clk_register+0x20/0x90 > qcom_cc_really_probe+0x320/0x338 > nss_cc_ipq9574_probe+0xac/0x1e8 > platform_probe+0x70/0xd0 > really_probe+0xdc/0x3b8 > __driver_probe_device+0x94/0x178 > driver_probe_device+0x48/0xf0 > __driver_attach+0x13c/0x208 > bus_for_each_dev+0x6c/0xb8 > driver_attach+0x2c/0x40 > bus_add_driver+0x100/0x250 > driver_register+0x68/0x138 > __platform_driver_register+0x2c/0x40 > nss_cc_ipq9574_driver_init+0x24/0x38 > do_one_initcall+0x88/0x340 > kernel_init_freeable+0x2ac/0x4f8 > kernel_init+0x28/0x1e8 > ret_from_fork+0x10/0x20 > > -> #0 (icc_bw_lock){+.+.}-{4:4}: > __lock_acquire+0x1348/0x2090 > lock_acquire+0x108/0x2d8 > icc_init+0x50/0x108 > do_one_initcall+0x88/0x340 > kernel_init_freeable+0x2ac/0x4f8 > kernel_init+0x28/0x1e8 > ret_from_fork+0x10/0x20 > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(fs_reclaim); > lock(icc_bw_lock); > lock(fs_reclaim); > lock(icc_bw_lock); > > *** DEADLOCK *** > > 1 lock held by swapper/0/1: > #0: ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108 > > stack backtrace: > CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-next-20250529 #0 NONE > Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT) > Call trace: > show_stack+0x20/0x38 (C) > dump_stack_lvl+0x90/0xd0 > dump_stack+0x18/0x28 > print_circular_bug+0x334/0x448 > check_noncircular+0x12c/0x140 > __lock_acquire+0x1348/0x2090 > lock_acquire+0x108/0x2d8 > icc_init+0x50/0x108 > do_one_initcall+0x88/0x340 > kernel_init_freeable+0x2ac/0x4f8 > kernel_init+0x28/0x1e8 > ret_from_fork+0x10/0x20 > > Move the memory allocation part of the code outside of the protected > path to eliminate the warning. Also add a note about why it is moved > to there, > > Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support") > Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> > --- > drivers/interconnect/core.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c > index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644 > --- a/drivers/interconnect/core.c > +++ b/drivers/interconnect/core.c > @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) > return; > > mutex_lock(&icc_lock); > + > + if (node->id >= ICC_DYN_ID_START) { > + /* > + * Memory allocation must be done outside of codepaths > + * protected by icc_bw_lock. > + */ > + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", > + node->name, dev_name(provider->dev)); > + } > + > mutex_lock(&icc_bw_lock); > > node->provider = provider; > @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) > node->avg_bw = node->init_avg; > node->peak_bw = node->init_peak; > > - if (node->id >= ICC_DYN_ID_START) > - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", > - node->name, dev_name(provider->dev)); > - > if (node->avg_bw || node->peak_bw) { > if (provider->pre_aggregate) > provider->pre_aggregate(node); > > --- > base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2 > change-id: 20250529-icc-bw-lockdep-ed030d892a19 > > Best regards, > -- > Gabor Juhos <j4g8y7@gmail.com> > > The locking in this code is a mess. Which data-structures does icc_lock protect node* pointers I think and which data-structures does icc_bw_lock protect - "bw" data structures ? Hmm. Looking at this code I'm not sure at all what icc_lock was introduced to do. Can we not just drop it entirely ? --- bod ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held 2025-05-30 9:16 ` Bryan O'Donoghue @ 2025-06-03 9:15 ` Gabor Juhos 2025-06-03 10:01 ` Bryan O'Donoghue 0 siblings, 1 reply; 8+ messages in thread From: Gabor Juhos @ 2025-06-03 9:15 UTC (permalink / raw) To: Bryan O'Donoghue, Georgi Djakov, Raviteja Laggyshetty Cc: linux-pm, linux-arm-msm, linux-kernel Hello Bryan, Sorry for the late reply, I missed your mail. 2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta: > On 29/05/2025 15:46, Gabor Juhos wrote: >> The 'icc_bw_lock' mutex is introduced in commit af42269c3523 >> ("interconnect: Fix locking for runpm vs reclaim") in order >> to decouple serialization of bw aggregation from codepaths >> that require memory allocation. >> >> However commit d30f83d278a9 ("interconnect: core: Add dynamic >> id allocation support") added a devm_kasprintf() call into a >> path protected by the 'icc_bw_lock' which causes this lockdep >> warning (at least on the IPQ9574 platform): > > Missing a Fixes tag. Erm, it is before my s-o-b tag. ... >> Move the memory allocation part of the code outside of the protected >> path to eliminate the warning. Also add a note about why it is moved >> to there, >> >> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support") >> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> >> --- >> drivers/interconnect/core.c | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c >> index >> 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644 >> --- a/drivers/interconnect/core.c >> +++ b/drivers/interconnect/core.c >> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct >> icc_provider *provider) >> return; >> >> mutex_lock(&icc_lock); >> + >> + if (node->id >= ICC_DYN_ID_START) { >> + /* >> + * Memory allocation must be done outside of codepaths >> + * protected by icc_bw_lock. >> + */ >> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", >> + node->name, dev_name(provider->dev)); >> + } >> + >> mutex_lock(&icc_bw_lock); >> >> node->provider = provider; >> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct >> icc_provider *provider) >> node->avg_bw = node->init_avg; >> node->peak_bw = node->init_peak; >> >> - if (node->id >= ICC_DYN_ID_START) >> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", >> - node->name, dev_name(provider->dev)); >> - >> if (node->avg_bw || node->peak_bw) { >> if (provider->pre_aggregate) >> provider->pre_aggregate(node); >> >> --- >> base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2 >> change-id: 20250529-icc-bw-lockdep-ed030d892a19 >> >> Best regards, >> -- >> Gabor Juhos <j4g8y7@gmail.com> >> >> > > The locking in this code is a mess. > > Which data-structures does icc_lock protect node* pointers I think and which > data-structures does icc_bw_lock protect - "bw" data structures ? > > Hmm. > > Looking at this code I'm not sure at all what icc_lock was introduced to do. Initially, only the 'icc_lock' mutex was here, and that protected 'everything'. The 'icc_bw_lock' has been introduced later by commit af42269c3523 ("interconnect: Fix locking for runpm vs reclaim") as part of the "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1]. Here is the reason copied from the original commit message: "For cases where icc_bw_set() can be called in callbaths that could deadlock against shrinker/reclaim, such as runpm resume, we need to decouple the icc locking. Introduce a new icc_bw_lock for cases where we need to serialize bw aggregation and update to decouple that from paths that require memory allocation such as node/link creation/ destruction." > Can we not just drop it entirely ? I'm not an expert in locking, but I doubt that we can easily drop any of the two mutexes without reintroducing the problem fixed by the change mentioned above. [1] https://lore.kernel.org/all/20230807171148.210181-1-robdclark@gmail.com/ Regards, Gabor ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held 2025-06-03 9:15 ` Gabor Juhos @ 2025-06-03 10:01 ` Bryan O'Donoghue 2025-06-18 12:50 ` Johan Hovold 0 siblings, 1 reply; 8+ messages in thread From: Bryan O'Donoghue @ 2025-06-03 10:01 UTC (permalink / raw) To: Gabor Juhos, Bryan O'Donoghue, Georgi Djakov, Raviteja Laggyshetty Cc: linux-pm, linux-arm-msm, linux-kernel On 03/06/2025 10:15, Gabor Juhos wrote: > Hello Bryan, > > Sorry for the late reply, I missed your mail. > > 2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta: >> On 29/05/2025 15:46, Gabor Juhos wrote: >>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523 >>> ("interconnect: Fix locking for runpm vs reclaim") in order >>> to decouple serialization of bw aggregation from codepaths >>> that require memory allocation. >>> >>> However commit d30f83d278a9 ("interconnect: core: Add dynamic >>> id allocation support") added a devm_kasprintf() call into a >>> path protected by the 'icc_bw_lock' which causes this lockdep >>> warning (at least on the IPQ9574 platform): >> >> Missing a Fixes tag. > > Erm, it is before my s-o-b tag. Great thank you I see that. > ... > >>> Move the memory allocation part of the code outside of the protected >>> path to eliminate the warning. Also add a note about why it is moved >>> to there, >>> >>> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support") >>> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> >>> --- >>> drivers/interconnect/core.c | 14 ++++++++++---- >>> 1 file changed, 10 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c >>> index >>> 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644 >>> --- a/drivers/interconnect/core.c >>> +++ b/drivers/interconnect/core.c >>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct >>> icc_provider *provider) >>> return; >>> >>> mutex_lock(&icc_lock); >>> + >>> + if (node->id >= ICC_DYN_ID_START) { >>> + /* >>> + * Memory allocation must be done outside of codepaths >>> + * protected by icc_bw_lock. >>> + */ >>> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", >>> + node->name, dev_name(provider->dev)); >>> + } >>> + >>> mutex_lock(&icc_bw_lock); >>> >>> node->provider = provider; >>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct >>> icc_provider *provider) >>> node->avg_bw = node->init_avg; >>> node->peak_bw = node->init_peak; >>> >>> - if (node->id >= ICC_DYN_ID_START) >>> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", >>> - node->name, dev_name(provider->dev)); >>> - >>> if (node->avg_bw || node->peak_bw) { >>> if (provider->pre_aggregate) >>> provider->pre_aggregate(node); >>> >>> --- >>> base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2 >>> change-id: 20250529-icc-bw-lockdep-ed030d892a19 >>> >>> Best regards, >>> -- >>> Gabor Juhos <j4g8y7@gmail.com> >>> >>> >> >> The locking in this code is a mess. >> >> Which data-structures does icc_lock protect node* pointers I think and which >> data-structures does icc_bw_lock protect - "bw" data structures ? >> >> Hmm. >> >> Looking at this code I'm not sure at all what icc_lock was introduced to do. > > Initially, only the 'icc_lock' mutex was here, and that protected 'everything'. > The 'icc_bw_lock' has been introduced later by commit af42269c3523 > ("interconnect: Fix locking for runpm vs reclaim") as part of the > "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1]. > > Here is the reason copied from the original commit message: > > "For cases where icc_bw_set() can be called in callbaths that could > deadlock against shrinker/reclaim, such as runpm resume, we need to > decouple the icc locking. Introduce a new icc_bw_lock for cases where > we need to serialize bw aggregation and update to decouple that from > paths that require memory allocation such as node/link creation/ > destruction." Right but reading this code. icc_set_bw(); icc_lock_bw - protects struct icc_node * icc_put(); icc_lock - locks icc_lock_bw -locks directly after protects struct icc_node * icc_node_add current: icc_lock - locks icc_lock_bw - locks node->name = devm_kasprintf(); After your change icc_node_add current: icc_lock - locks node->name = devm_kasprintf(); icc_lock_bw - locks owns node->provider - or whatever And this is what is prompting my question. Which locks own which data here ? I think we should sort that out, either by removing one of the locks or by at the very least documenting beside the mutex declarations which locks protect what. --- bod >> Can we not just drop it entirely ? > > I'm not an expert in locking, but I doubt that we can easily drop any of the two > mutexes without reintroducing the problem fixed by the change mentioned above. > > [1] https://lore.kernel.org/all/20230807171148.210181-1-robdclark@gmail.com/ > > Regards, > Gabor > > Right - if this were a struct we would declare what these individual mutexes lock. That's my question here, as I review this code, which mutex protects what ? I don't think that is particularly clear. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held 2025-06-03 10:01 ` Bryan O'Donoghue @ 2025-06-18 12:50 ` Johan Hovold 2025-06-18 13:28 ` Bryan O'Donoghue 0 siblings, 1 reply; 8+ messages in thread From: Johan Hovold @ 2025-06-18 12:50 UTC (permalink / raw) To: Bryan O'Donoghue, Rob Clark Cc: Gabor Juhos, Bryan O'Donoghue, Georgi Djakov, Raviteja Laggyshetty, linux-pm, linux-arm-msm, linux-kernel [ +CC: Rob ] On Tue, Jun 03, 2025 at 10:01:31AM +0000, Bryan O'Donoghue wrote: > On 03/06/2025 10:15, Gabor Juhos wrote: > > 2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta: > >> On 29/05/2025 15:46, Gabor Juhos wrote: > >>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523 > >>> ("interconnect: Fix locking for runpm vs reclaim") in order > >>> to decouple serialization of bw aggregation from codepaths > >>> that require memory allocation. > >>> > >>> However commit d30f83d278a9 ("interconnect: core: Add dynamic > >>> id allocation support") added a devm_kasprintf() call into a > >>> path protected by the 'icc_bw_lock' which causes this lockdep > >>> warning (at least on the IPQ9574 platform): > >>> Move the memory allocation part of the code outside of the protected > >>> path to eliminate the warning. Also add a note about why it is moved > >>> to there, > >>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct > >>> icc_provider *provider) > >>> return; > >>> > >>> mutex_lock(&icc_lock); > >>> + > >>> + if (node->id >= ICC_DYN_ID_START) { > >>> + /* > >>> + * Memory allocation must be done outside of codepaths > >>> + * protected by icc_bw_lock. > >>> + */ > >>> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", > >>> + node->name, dev_name(provider->dev)); > >>> + } > >>> + > >>> mutex_lock(&icc_bw_lock); > >>> > >>> node->provider = provider; > >>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct > >>> icc_provider *provider) > >>> node->avg_bw = node->init_avg; > >>> node->peak_bw = node->init_peak; > >>> > >>> - if (node->id >= ICC_DYN_ID_START) > >>> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", > >>> - node->name, dev_name(provider->dev)); > >>> - > >>> if (node->avg_bw || node->peak_bw) { > >>> if (provider->pre_aggregate) > >>> provider->pre_aggregate(node); > >> The locking in this code is a mess. > >> > >> Which data-structures does icc_lock protect node* pointers I think and which > >> data-structures does icc_bw_lock protect - "bw" data structures ? > >> > >> Hmm. > >> > >> Looking at this code I'm not sure at all what icc_lock was introduced to do. > > > > Initially, only the 'icc_lock' mutex was here, and that protected 'everything'. > > The 'icc_bw_lock' has been introduced later by commit af42269c3523 > > ("interconnect: Fix locking for runpm vs reclaim") as part of the > > "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1]. > > > > Here is the reason copied from the original commit message: > > > > "For cases where icc_bw_set() can be called in callbaths that could > > deadlock against shrinker/reclaim, such as runpm resume, we need to > > decouple the icc locking. Introduce a new icc_bw_lock for cases where > > we need to serialize bw aggregation and update to decouple that from > > paths that require memory allocation such as node/link creation/ > > destruction." > > Right but reading this code. > > icc_set_bw(); > icc_lock_bw - protects struct icc_node * > > icc_put(); > icc_lock - locks > icc_lock_bw -locks directly after protects struct icc_node * > > icc_node_add current: > icc_lock - locks > icc_lock_bw - locks > node->name = devm_kasprintf(); > > After your change > > icc_node_add current: > icc_lock - locks > node->name = devm_kasprintf(); > icc_lock_bw - locks > owns node->provider - or whatever > > And this is what is prompting my question. Which locks own which data here ? > > I think we should sort that out, either by removing one of the locks or > by at the very least documenting beside the mutex declarations which > locks protect what. Feel free to discuss that with Rob who added the icc_lock_bw, but it's unrelated to the regression at hand (and should not block fixing it). Allocations cannot be done while holding the icc_lock_bw, and this fix is correct in moving the allocation (also note that the node has not been added yet). Johan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held 2025-06-18 12:50 ` Johan Hovold @ 2025-06-18 13:28 ` Bryan O'Donoghue 0 siblings, 0 replies; 8+ messages in thread From: Bryan O'Donoghue @ 2025-06-18 13:28 UTC (permalink / raw) To: Johan Hovold, Bryan O'Donoghue, Rob Clark Cc: Gabor Juhos, Georgi Djakov, Raviteja Laggyshetty, linux-pm, linux-arm-msm, linux-kernel On 18/06/2025 13:50, Johan Hovold wrote: >> I think we should sort that out, either by removing one of the locks or >> by at the very least documenting beside the mutex declarations which >> locks protect what. > Feel free to discuss that with Rob who added the icc_lock_bw, but it's > unrelated to the regression at hand (and should not block fixing it). True. Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held 2025-05-29 14:46 ` [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held Gabor Juhos 2025-05-30 9:16 ` Bryan O'Donoghue @ 2025-06-18 12:43 ` Johan Hovold 2025-06-18 19:38 ` Gabor Juhos 1 sibling, 1 reply; 8+ messages in thread From: Johan Hovold @ 2025-06-18 12:43 UTC (permalink / raw) To: Gabor Juhos, Georgi Djakov Cc: Raviteja Laggyshetty, Bryan O'Donoghue, linux-pm, linux-arm-msm, linux-kernel On Thu, May 29, 2025 at 04:46:22PM +0200, Gabor Juhos wrote: > The 'icc_bw_lock' mutex is introduced in commit af42269c3523 > ("interconnect: Fix locking for runpm vs reclaim") in order > to decouple serialization of bw aggregation from codepaths > that require memory allocation. > > However commit d30f83d278a9 ("interconnect: core: Add dynamic > id allocation support") added a devm_kasprintf() call into a > path protected by the 'icc_bw_lock' which causes this lockdep > warning (at least on the IPQ9574 platform): > > ====================================================== > WARNING: possible circular locking dependency detected > 6.15.0-next-20250529 #0 Not tainted > ------------------------------------------------------ > swapper/0/1 is trying to acquire lock: > ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108 > > but task is already holding lock: > ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108 > > which lock already depends on the new lock. Thanks for fixing this. I get a similar splat with sc8280xp and the icc_ism_l3 driver since 6.16-rc1. Georgi, this is a regression that prevents lockdep from being used on a bunch of Qualcomm platforms and should be fixed in mainline ASAP (e.g. to avoid further locking issues from being introduced). > Move the memory allocation part of the code outside of the protected > path to eliminate the warning. Also add a note about why it is moved > to there, > > Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support") > Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> > --- > drivers/interconnect/core.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c > index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644 > --- a/drivers/interconnect/core.c > +++ b/drivers/interconnect/core.c > @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) > return; > > mutex_lock(&icc_lock); > + > + if (node->id >= ICC_DYN_ID_START) { > + /* > + * Memory allocation must be done outside of codepaths > + * protected by icc_bw_lock. > + */ > + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", > + node->name, dev_name(provider->dev)); > + } The node name has already been set by the caller and the node has not been added yet, so I think you should move this before taking the icc_lock. > + > mutex_lock(&icc_bw_lock); > > node->provider = provider; With that addressed, feel free to add my: Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Johan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held 2025-06-18 12:43 ` Johan Hovold @ 2025-06-18 19:38 ` Gabor Juhos 0 siblings, 0 replies; 8+ messages in thread From: Gabor Juhos @ 2025-06-18 19:38 UTC (permalink / raw) To: Johan Hovold, Georgi Djakov Cc: Raviteja Laggyshetty, Bryan O'Donoghue, linux-pm, linux-arm-msm, linux-kernel Hi Johan, Thank you for taking a look into this. 2025. 06. 18. 14:43 keltezéssel, Johan Hovold írta: > On Thu, May 29, 2025 at 04:46:22PM +0200, Gabor Juhos wrote: >> The 'icc_bw_lock' mutex is introduced in commit af42269c3523 >> ("interconnect: Fix locking for runpm vs reclaim") in order >> to decouple serialization of bw aggregation from codepaths >> that require memory allocation. >> >> However commit d30f83d278a9 ("interconnect: core: Add dynamic >> id allocation support") added a devm_kasprintf() call into a >> path protected by the 'icc_bw_lock' which causes this lockdep >> warning (at least on the IPQ9574 platform): >> >> ====================================================== >> WARNING: possible circular locking dependency detected >> 6.15.0-next-20250529 #0 Not tainted >> ------------------------------------------------------ >> swapper/0/1 is trying to acquire lock: >> ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108 >> >> but task is already holding lock: >> ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108 >> >> which lock already depends on the new lock. > > Thanks for fixing this. I get a similar splat with sc8280xp and the > icc_ism_l3 driver since 6.16-rc1. > > Georgi, this is a regression that prevents lockdep from being used on a > bunch of Qualcomm platforms and should be fixed in mainline ASAP (e.g. > to avoid further locking issues from being introduced). > >> Move the memory allocation part of the code outside of the protected >> path to eliminate the warning. Also add a note about why it is moved >> to there, >> >> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support") >> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> >> --- >> drivers/interconnect/core.c | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c >> index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644 >> --- a/drivers/interconnect/core.c >> +++ b/drivers/interconnect/core.c >> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) >> return; >> >> mutex_lock(&icc_lock); >> + >> + if (node->id >= ICC_DYN_ID_START) { >> + /* >> + * Memory allocation must be done outside of codepaths >> + * protected by icc_bw_lock. >> + */ >> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s", >> + node->name, dev_name(provider->dev)); >> + } > > The node name has already been set by the caller and the node has not > been added yet, so I think you should move this before taking the > icc_lock. It seems reasonable. I will send a modified version, which also contains handling of memory allocation failures. > >> + >> mutex_lock(&icc_bw_lock); >> >> node->provider = provider; > > With that addressed, feel free to add my: > > Reviewed-by: Johan Hovold <johan+linaro@kernel.org> > Tested-by: Johan Hovold <johan+linaro@kernel.org> Regards, Gabor ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-06-18 19:38 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <TIkPOGVjPeCjPzjVtlSb6V5CIcpaXf2-6WG6HjAyaOW59Hj01-9VK7Z8DKadakOKr6fJeQICi6h0Z8mft9DQyg==@protonmail.internalid> 2025-05-29 14:46 ` [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held Gabor Juhos 2025-05-30 9:16 ` Bryan O'Donoghue 2025-06-03 9:15 ` Gabor Juhos 2025-06-03 10:01 ` Bryan O'Donoghue 2025-06-18 12:50 ` Johan Hovold 2025-06-18 13:28 ` Bryan O'Donoghue 2025-06-18 12:43 ` Johan Hovold 2025-06-18 19:38 ` Gabor Juhos
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).