From: Kapileshwar Singh <kapileshwar.singh@arm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Javi Merino <Javi.Merino@arm.com>,
Eduardo Valentin <edubezval@gmail.com>,
Zhang Rui <rui.zhang@intel.com>,
Linux PM list <linux-pm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Punit Agrawal <Punit.Agrawal@arm.com>,
Lina Iyer <lina.iyer@linaro.org>, Mark Brown <broonie@kernel.org>,
Jon Medhurst <tixy@linaro.org>
Subject: Re: [PATCH v3 5/5] thermal: cpu_cooling: update the cpu device when cpufreq updates the policy cpu
Date: Tue, 03 Mar 2015 15:09:48 +0000 [thread overview]
Message-ID: <54F5CEBC.1070303@arm.com> (raw)
In-Reply-To: <CAKohponB3cNXT8p=SqrfpGEPjLpeE=tnzUVOiabV8hEfBMb8rQ@mail.gmail.com>
On 03/03/15 13:07, Viresh Kumar wrote:
> On 3 March 2015 at 17:11, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote:
>> Yes I indeed tested the case where we cache the device pointer of the CPU for which the OPP's are populated.
>> When this CPU is hotplugged out, it invalidates the device pointer itself. Here are the error we get in dmesg:
>
> What do you mean by 'invalidates the device pointer' ? that cpu_dev is NULL ?
The cpu_dev is not NULL but we get an erroneous OPP back. We found the problem lies in the way we calculate the frequency for the cluster.
>> <3>[67203.216774] opp_get_voltage: Invalid parameters
>> <3>[67203.326774] opp_get_voltage: Invalid parameters
>> <3>[67203.326774] opp_get_voltage: Invalid parameters
>
> Have you handwritten them ? Why don't they precede with dev_pm_* ??
I have not handwritten them, It was from a Linaro 3.10 based kernel when I first noticed this issue but the same problem exists in mainline.
Apologies for this I sent you an older trace which I had saved when I found the bug. Here is the trace I get from mainline
[ 5680.135339] dev_pm_opp_get_voltage: Invalid parameters
[ 5680.245528] dev_pm_opp_get_voltage: Invalid parameters
[ 5680.355432] dev_pm_opp_get_voltage: Invalid parameters
[ 5680.465521] dev_pm_opp_get_voltage: Invalid parameters
[ 5680.575599] dev_pm_opp_get_voltage: Invalid parameters
[ 5680.685817] dev_pm_opp_get_voltage: Invalid parameters
[ 5680.795556] dev_pm_opp_get_voltage: Invalid parameters
[ 5680.905598] dev_pm_opp_get_voltage: Invalid parameters
>
>>
>> Which happens because:
>>
>> unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
>> {
>> ..
>> tmp_opp = rcu_dereference(opp);
>> if (unlikely(IS_ERR_OR_NULL(tmp_opp)) || !tmp_opp->available)
>> pr_err("%s: Invalid parameters\n", __func__);
>
> This %s should print routine name ..
>
>> else
>> ..
>>
>> Which happens when
>>
>> opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
>> true);
>>
>> returns a an erroneous or NULL OPP or the opp is unavailable (in the above condition)
>
Update: This returns an erroneous OPP
> Please goto the depth of this thing, as I don't think it should happen.
>
> Over that I was asking you if you have tested the solution Javi gave,
> because OPPs
> wouldn't have been initialized for other CPUs once policy->cpu goes down.
I did test this but we were working with the assumption that OPPs should be populated for all the CPUs and also that OPPs are lost for a hotplugged CPU which I see is not the case.
We have looked at this more closely and found that problem lies in:
freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus));
which returns a NULL frequency as we are not checking for online CPUs here. We shall come up with a fix for this. Many thanks for helping us with the investigation.
Regards,
KP
next prev parent reply other threads:[~2015-03-03 15:09 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-02 17:17 [PATCH v3 0/5] Subject: The power allocator thermal governor Javi Merino
2015-03-02 17:17 ` [PATCH v3 1/5] thermal: introduce the Power Allocator governor Javi Merino
2015-03-02 17:17 ` [PATCH v3 2/5] thermal: add trace events to the power allocator governor Javi Merino
2017-03-15 4:26 ` Viresh Kumar
2015-03-02 17:17 ` [PATCH v3 3/5] thermal: export thermal_zone_parameters to sysfs Javi Merino
2015-03-02 17:17 ` [PATCH v3 4/5] Revert "cpufreq: remove CPUFREQ_UPDATE_POLICY_CPU notifications" Javi Merino
2015-03-02 17:17 ` [PATCH v3 5/5] thermal: cpu_cooling: update the cpu device when cpufreq updates the policy cpu Javi Merino
2015-03-03 4:03 ` Viresh Kumar
2015-03-03 10:59 ` Kapileshwar Singh
2015-03-03 11:19 ` Viresh Kumar
2015-03-03 11:41 ` Kapileshwar Singh
2015-03-03 13:07 ` Viresh Kumar
2015-03-03 15:09 ` Kapileshwar Singh [this message]
2015-03-03 15:26 ` Sudeep Holla
2015-03-03 15:30 ` Viresh Kumar
2015-03-03 15:33 ` Sudeep Holla
2015-03-03 15:29 ` Viresh Kumar
2015-03-03 15:34 ` Kapileshwar Singh
2015-03-02 17:28 ` [PATCH v3 0/5] Subject: The power allocator thermal governor Eduardo Valentin
2015-03-02 17:40 ` Javi Merino
2015-03-02 18:47 ` Eduardo Valentin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54F5CEBC.1070303@arm.com \
--to=kapileshwar.singh@arm.com \
--cc=Javi.Merino@arm.com \
--cc=Punit.Agrawal@arm.com \
--cc=broonie@kernel.org \
--cc=edubezval@gmail.com \
--cc=lina.iyer@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=rui.zhang@intel.com \
--cc=tixy@linaro.org \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.