From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
Meelis Roos <mroos@linux.ee>,
"cpufreq@vger.kernel.org" <cpufreq@vger.kernel.org>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 5/5] cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end
Date: Tue, 29 Apr 2014 11:38:24 +0530 [thread overview]
Message-ID: <535F41D8.90002@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAKohpomaJ0ubnzzQ7AwAd6z+Zm085i+UwB7XMa=si7qthZdgZg@mail.gmail.com>
On 04/29/2014 10:21 AM, Viresh Kumar wrote:
> Nice effort.
>
Thanks! :-)
> On 29 April 2014 00:25, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> Now all such drivers have been fixed, but debugging this issue was not
>> very straight-forward (even lockdep didn't catch this). So let us add a
>> debug infrastructure to the cpufreq core to catch such issues more easily
>> in the future.
>
> BUT, I am not sure if we really need it :(
>
> I think we just got into the 'barrier' stuff as we had some doubts about it
> earlier and were quite sure that nothing else could go wrong. Otherwise
> the only problem could have been present was the second queuing
> from the same thread. And we will surely get stuck if that happens and
> we can't just miss that error..
>
Yeah, and we _did_ hit that hang, but it was not at all intuitive at first
as to what was going wrong. Worse, even lockdep is not in a position to catch
such scenarios. So it definitely doesn't hurt to add a small infrastructure
to catch such issues in the future, IMHO.
Besides, if we can add features for users, surely we can also add some
non-intrusive debug code for ourselves too, to make our lives easier, right? :-)
I'm sure we deserve that privilege ;-)
Regards,
Srivatsa S. Bhat
>> Scenario 1: (Deadlock-free)
>> ----------
>>
>> Task A Task B
>>
>> /* 1st freq transition */
>> Invoke _begin() {
>> ...
>> ...
>> }
>>
>> Change the frequency
>>
>> Got interrupt for successful
>> change of frequency.
>>
>> /* 1st freq transition */
>> Invoke _end() {
>> ...
>> ...
>> /* 2nd freq transition */ ...
>> Invoke _begin() { ...
>> ... //waiting for B ...
>> ... //to finish _end() }
>> ...
>> ...
>> }
>>
>>
>> This scenario is actually deadlock-free because Task A can wait inside the
>> second call to _begin() without self-deadlocking, because it is the
>> responsibility of Task B to finish the first sequence by invoking the
>> corresponding _end().
>>
>> By setting the value of 'transition_task' again explicitly in _end(), we
>> ensure that the code won't print a false-positive warning in this case.
>>
>> However the same code successfully catches the following deadlock-prone
>> scenario even in ASYNC_NOTIFICATION drivers:
>>
>> Scenario 2: (Deadlock-prone)
>> ----------
>>
>> Task A Task B
>>
>> /* 1st freq transition */
>> Invoke _begin() {
>> ...
>> ...
>> }
>>
>> /* 2nd freq transition */
>> Invoke _begin() {
>> ...
>> ...
>> }
>>
>> Change the frequency
>>
>>
>> Here the bug is that Task A called the second _begin() *before* actually
>> performing the 1st frequency transition. In other words, it failed to set
>> Task B in motion for the 1st frequency transition, and hence it will
>> self-deadlock. This is very similar to the case of drivers which do
>> synchronous notification, and hence the debug infrastructure developed
>> in this patch can catch this scenario easily.
>>
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> ---
>>
>> drivers/cpufreq/cpufreq.c | 12 ++++++++++++
>> include/linux/cpufreq.h | 1 +
>> 2 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index abda660..2c99a6c 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -354,6 +354,10 @@ static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
>> void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
>> struct cpufreq_freqs *freqs)
>> {
>> +
>> + /* Catch double invocations of _begin() which lead to self-deadlock */
>> + WARN_ON(current == policy->transition_task);
>> +
>> wait:
>> wait_event(policy->transition_wait, !policy->transition_ongoing);
>>
>> @@ -365,6 +369,7 @@ wait:
>> }
>>
>> policy->transition_ongoing = true;
>> + policy->transition_task = current;
>>
>> spin_unlock(&policy->transition_lock);
>>
>> @@ -378,9 +383,16 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
>> if (unlikely(WARN_ON(!policy->transition_ongoing)))
>> return;
>>
>> + /*
>> + * The task invoking _end() could be different from the one that
>> + * invoked the _begin(). So set ->transition_task again here
>> + * explicity.
>> + */
>> + policy->transition_task = current;
>> cpufreq_notify_post_transition(policy, freqs, transition_failed);
>>
>> policy->transition_ongoing = false;
>> + policy->transition_task = NULL;
>>
>> wake_up(&policy->transition_wait);
>> }
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index 5ae5100..8f44d79 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -110,6 +110,7 @@ struct cpufreq_policy {
>> bool transition_ongoing; /* Tracks transition status */
>> spinlock_t transition_lock;
>> wait_queue_head_t transition_wait;
>> + struct task_struct *transition_task; /* Task which is doing the transition */
>> };
>>
>> /* Only for ACPI */
>>
next prev parent reply other threads:[~2014-04-29 6:08 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-28 18:53 [PATCH v2 0/5] Cpufreq frequency serialization fixes Srivatsa S. Bhat
2014-04-28 18:54 ` [PATCH v2 1/5] cpufreq, longhaul: Fix double invocation of cpufreq_freq_transition_begin/end Srivatsa S. Bhat
2014-04-28 18:54 ` [PATCH v2 2/5] cpufreq, powernow-k6: Fix incorrect comparison with max_multipler Srivatsa S. Bhat
2014-04-28 18:54 ` [PATCH v2 3/5] cpufreq, powernow-k6: Fix double invocation of cpufreq_freq_transition_begin/end Srivatsa S. Bhat
2014-04-28 18:54 ` [PATCH v2 4/5] cpufreq, powernow-k7: " Srivatsa S. Bhat
2014-04-28 18:55 ` [PATCH v2 5/5] cpufreq: Catch double invocations " Srivatsa S. Bhat
2014-04-29 4:51 ` Viresh Kumar
2014-04-29 4:55 ` Viresh Kumar
2014-04-29 6:16 ` Srivatsa S. Bhat
2014-04-29 6:49 ` Viresh Kumar
2014-04-29 7:35 ` Srivatsa S. Bhat
2014-04-29 8:04 ` Viresh Kumar
2014-04-29 8:10 ` Srivatsa S. Bhat
2014-04-29 6:08 ` Srivatsa S. Bhat [this message]
2014-04-28 20:57 ` [PATCH v2 0/5] Cpufreq frequency serialization fixes Meelis Roos
2014-04-28 23:47 ` Rafael J. Wysocki
2014-04-29 5:59 ` Srivatsa S. Bhat
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=535F41D8.90002@linux.vnet.ibm.com \
--to=srivatsa.bhat@linux.vnet.ibm.com \
--cc=cpufreq@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=mroos@linux.ee \
--cc=rjw@rjwysocki.net \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).