From: Waiman Long <longman@redhat.com>
To: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, tuanphan@os.amperecomputing.com,
robin.murphy@arm.com, suzuki.poulose@arm.com
Subject: Re: [PATCH] perf/arm-dmc620: Reverse locking order in dmc620_pmu_get_irq()
Date: Tue, 11 Apr 2023 09:44:22 -0400
Message-ID: <5a3d4bed-e59c-5728-ce92-c97c905cb0c8@redhat.com>
In-Reply-To: <20230411123823.GA22686@willie-the-truck>
On 4/11/23 08:38, Will Deacon wrote:
> Hi Waiman,
>
> [+Tuan Phan, Robin and Suzuki]
>
> On Wed, Apr 05, 2023 at 01:28:42PM -0400, Waiman Long wrote:
>> The following circular locking dependency was reported when running
>> cpus online/offline test on an arm64 system.
>>
>> [ 84.195923] Chain exists of:
>> dmc620_pmu_irqs_lock --> cpu_hotplug_lock --> cpuhp_state-down
>>
>> [ 84.207305] Possible unsafe locking scenario:
>>
>> [ 84.213212]        CPU0                    CPU1
>> [ 84.217729]        ----                    ----
>> [ 84.222247]   lock(cpuhp_state-down);
>> [ 84.225899]                                lock(cpu_hotplug_lock);
>> [ 84.232068]                                lock(cpuhp_state-down);
>> [ 84.238237]   lock(dmc620_pmu_irqs_lock);
>> [ 84.242236]
>> *** DEADLOCK ***
>>
>> The problematic locking order seems to be
>>
>> lock(dmc620_pmu_irqs_lock) --> lock(cpu_hotplug_lock)
>>
>> This locking order happens when dmc620_pmu_get_irq() is called from
>> dmc620_pmu_device_probe(). Fix this possible deadlock scenario by
>> reversing the locking order.
>>
>> Also export __cpuhp_state_add_instance_cpuslocked() so that it can be
>> accessed by kernel modules.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> drivers/perf/arm_dmc620_pmu.c | 4 +++-
>> kernel/cpu.c | 1 +
>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/perf/arm_dmc620_pmu.c b/drivers/perf/arm_dmc620_pmu.c
>> index 54aa4658fb36..78d3bfbe96a6 100644
>> --- a/drivers/perf/arm_dmc620_pmu.c
>> +++ b/drivers/perf/arm_dmc620_pmu.c
>> @@ -425,7 +425,7 @@ static struct dmc620_pmu_irq *__dmc620_pmu_get_irq(int irq_num)
>>  	if (ret)
>>  		goto out_free_irq;
>>
>> -	ret = cpuhp_state_add_instance_nocalls(cpuhp_state_num, &irq->node);
>> +	ret = cpuhp_state_add_instance_nocalls_cpuslocked(cpuhp_state_num, &irq->node);
>>  	if (ret)
>>  		goto out_free_irq;
>>
>> @@ -445,9 +445,11 @@ static int dmc620_pmu_get_irq(struct dmc620_pmu *dmc620_pmu, int irq_num)
>>  {
>>  	struct dmc620_pmu_irq *irq;
>>
>> +	cpus_read_lock();
>>  	mutex_lock(&dmc620_pmu_irqs_lock);
>>  	irq = __dmc620_pmu_get_irq(irq_num);
>>  	mutex_unlock(&dmc620_pmu_irqs_lock);
>> +	cpus_read_unlock();
>>
>>  	if (IS_ERR(irq))
>>  		return PTR_ERR(irq);
>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>> index 6c0a92ca6bb5..05daaef362e6 100644
>> --- a/kernel/cpu.c
>> +++ b/kernel/cpu.c
>> @@ -2036,6 +2036,7 @@ int __cpuhp_state_add_instance_cpuslocked(enum cpuhp_state state,
>>  	mutex_unlock(&cpuhp_state_mutex);
>>  	return ret;
>>  }
>> +EXPORT_SYMBOL_GPL(__cpuhp_state_add_instance_cpuslocked);

> Thanks for the fix, but I think it would be much cleaner if we could handle
> this internally to the driver without having to export additional symbols
> from the hotplug machinery.
>
> Looking at the driver, it looks like it would make more sense to register
> each PMU instance with the cpuhp state machine and avoid having to traverse
> a mutable list, rather than add irq instances.
>
> That said, I really don't grok this comment:
>
> /* We're only reading, but this isn't the place to be involving RCU */
>
> Given that perf_pmu_migrate_context() calls synchronize_rcu()...
>
> So perhaps we could just walk the list using RCU as the easiest fix?

My patch is just one possible fix, and I don't mind if you have a better
one in mind. My knowledge of the driver's internals is limited, so it
would be great if someone more familiar with the driver could come up
with a better fix.

Thanks,
longman
Thread overview: 4+ messages
2023-04-05 17:28 [PATCH] perf/arm-dmc620: Reverse locking order in dmc620_pmu_get_irq() Waiman Long
2023-04-11 12:38 ` Will Deacon
2023-04-11 13:44 ` Waiman Long [this message]
2023-04-11 16:27 ` Robin Murphy