From: Bjorn Andersson <bjorn.andersson@linaro.org>
To: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Andy Gross <agross@kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	linux-arm-msm@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: [PATCH 2/4] cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
Date: Mon, 7 Mar 2022 13:49:00 -0800
Message-ID: <YiZ9zFqz2hU8im4f@ripper>
In-Reply-To: <20220307153050.3392700-2-dmitry.baryshkov@linaro.org>

On Mon 07 Mar 07:30 PST 2022, Dmitry Baryshkov wrote:

> qcom_lmh_dcvs_poll() can be running while the CPU is being put
> offline. This results in the warning below. The driver does disable
> the worker, but only close to the end of cpufreq_offline(). Change the
> locking in qcom_lmh_dcvs_poll() so that the worker cannot run in
> parallel with the cpufreq_offline() call.
> 
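For context, the offline path that the worker races with looks roughly
like this in the driver as of this patch (a sketch from memory of
qcom-cpufreq-hw.c, so details may differ slightly):

	static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
	{
		if (data->throttle_irq <= 0)
			return;

		/*
		 * cpufreq_offline() holds policy->rwsem for writing around
		 * the driver teardown, which is what the trylock added by
		 * this patch synchronizes against.
		 */
		mutex_lock(&data->throttle_lock);
		data->cancel_throttle = true;
		mutex_unlock(&data->throttle_lock);

		/* Flush a worker instance that already passed the check. */
		cancel_delayed_work_sync(&data->throttle_work);
		free_irq(data->throttle_irq, data);
	}
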
> [   37.122433] ------------[ cut here ]------------
> [   37.127225] WARNING: CPU: 0 PID: 187 at drivers/base/arch_topology.c:180 topology_update_thermal_pressure+0xec/0x100

I don't see a warning on line 180 of arch_topology.c.

I do, however, believe that my proposed patches for handling the race
during initialization would end up with a warning there.

As Viresh pointed out when I tried to land those upstream, they would
cause problems because policy->cpus changes during hotplug, i.e.
something very similar to what you're showing here.

Could it be that you're testing this in a tree that has those patches?


PS. The two patches that did land upstream in the end are:
4f774c4a65bf ("cpufreq: Reintroduce ready() callback")
ef8ee1cb8fc8 ("cpufreq: qcom-hw: Delay enabling throttle_irq")
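
For reference, ef8ee1cb8fc8 requests the LMH interrupt with
IRQF_NO_AUTOEN and only enables it from the new ready() callback, once
the policy is fully initialized. Roughly (quoting from memory, so
check the actual commit):

	static void qcom_cpufreq_ready(struct cpufreq_policy *policy)
	{
		struct qcom_cpufreq_data *data = policy->driver_data;

		if (data->throttle_irq >= 0)
			enable_irq(data->throttle_irq);
	}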

> [   37.138098] Modules linked in:
> [   37.141279] CPU: 0 PID: 187 Comm: kworker/0:3 Tainted: G S                5.17.0-rc6-00389-g37c83d0b8710-dirty #713

389 patches off mainline is quite far from upstream; please validate
your changes on something closer to mainline.

Regards,
Bjorn

> [   37.158306] Workqueue: events qcom_lmh_dcvs_poll
> [   37.163095] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   37.170278] pc : topology_update_thermal_pressure+0xec/0x100
> [   37.176131] lr : topology_update_thermal_pressure+0x20/0x100
> [   37.181977] sp : ffff800009b6bce0
> [   37.185402] x29: ffff800009b6bce0 x28: ffffd87abe92b000 x27: ffff04bd7292e205
> [   37.192792] x26: ffffd87abe930af8 x25: ffffd87abe94e4c8 x24: 0000000000000000
> [   37.200180] x23: ffff04bb01177018 x22: ffff04bb011770c0 x21: ffff04bb01177000
> [   37.207567] x20: ffff04bb0a419000 x19: 00000000000c4e00 x18: 0000000000000000
> [   37.214954] x17: 000000040044ffff x16: 004000b2b5503510 x15: 0000006aaa1326d2
> [   37.222333] x14: 0000000000000232 x13: 0000000000000001 x12: 0000000000000040
> [   37.229718] x11: ffff04bb00400000 x10: 968f57bd39f701c8 x9 : ffff04bb0acc8674
> [   37.237095] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffffd87abd90092c
> [   37.244478] x5 : 0000000000000016 x4 : 0000000000000000 x3 : 0000000000000100
> [   37.251852] x2 : ffff04bb0a419020 x1 : 0000000000000100 x0 : 0000000000000100
> [   37.259235] Call trace:
> [   37.261771]  topology_update_thermal_pressure+0xec/0x100
> [   37.267266]  qcom_lmh_dcvs_poll+0xbc/0x154
> [   37.271505]  process_one_work+0x288/0x69c
> [   37.275654]  worker_thread+0x74/0x470
> [   37.279450]  kthread+0xfc/0x100
> [   37.282712]  ret_from_fork+0x10/0x20
> [   37.286417] irq event stamp: 74
> [   37.289664] hardirqs last  enabled at (73): [<ffffd87abdd78af4>] _raw_spin_unlock_irq+0x44/0x80
> [   37.298632] hardirqs last disabled at (74): [<ffffd87abdd71fc0>] __schedule+0x710/0xa10
> [   37.306885] softirqs last  enabled at (58): [<ffffd87abcc90410>] _stext+0x410/0x588
> [   37.314778] softirqs last disabled at (51): [<ffffd87abcd1bf68>] __irq_exit_rcu+0x158/0x174
> [   37.323386] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 740518d8ae16..920c80d91c21 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -283,6 +283,23 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  	struct dev_pm_opp *opp;
>  	unsigned int freq;
>  
> +	/*
> +	 * Synchronize against CPU going offline.
> +	 * cpufreq_offline() will get the write lock on policy->rwsem.
> +	 */
> +retry:
> +	if (unlikely(!down_read_trylock(&policy->rwsem))) {
> +		mutex_lock(&data->throttle_lock);
> +		if (data->cancel_throttle) {
> +			mutex_unlock(&data->throttle_lock);
> +			return;
> +		}
> +
> +		mutex_unlock(&data->throttle_lock);
> +
> +		schedule();
> +		goto retry;
> +	}
>  	/*
>  	 * Get the h/w throttled frequency, normalize it using the
>  	 * registered opp table and use it to calculate thermal pressure.
> @@ -301,9 +318,10 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  
>  	/*
>  	 * In the unlikely case policy is unregistered do not enable
> -	 * polling or h/w interrupt
> +	 * polling or h/w interrupt.
> +	 * If we are here, we hold the policy->rwsem read lock;
> +	 * cancel_throttle can only be toggled under the write lock.
>  	 */
> -	mutex_lock(&data->throttle_lock);
>  	if (data->cancel_throttle)
>  		goto out;
>  
> @@ -318,7 +336,7 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>  				 msecs_to_jiffies(10));
>  
>  out:
> -	mutex_unlock(&data->throttle_lock);
> +	up_read(&policy->rwsem);
>  }
>  
>  static void qcom_lmh_dcvs_poll(struct work_struct *work)
> -- 
> 2.34.1
> 
