All of lore.kernel.org
 help / color / mirror / Atom feed
From: Prarit Bhargava <prarit@redhat.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: linux-kernel@vger.kernel.org, robert.schoene@tu-dresden.de,
	sboyd@codeaurora.org, Viresh Kumar <viresh.kumar@linaro.org>,
	linux-pm@vger.kernel.org
Subject: Re: [PATCH 5/5] cpufreq, add BUG() messages in critical paths to aid debugging failures
Date: Sun, 09 Nov 2014 09:12:42 -0500	[thread overview]
Message-ID: <545F765A.3040007@redhat.com> (raw)
In-Reply-To: <1874335.TzunrgXvuB@vostro.rjw.lan>



On 11/08/2014 04:46 PM, Rafael J. Wysocki wrote:
> On Saturday, November 08, 2014 08:33:35 AM Prarit Bhargava wrote:
>>
>> On 11/07/2014 09:00 PM, Rafael J. Wysocki wrote:
>>> On Wednesday, November 05, 2014 09:53:59 AM Prarit Bhargava wrote:
>>>> Add some additional debug to capture failures in the locking scheme for
>>>> cpufreq.  Instead of just a NULL pointer, these warnings will capture failure
>>>> points if the locking scheme for cpufreq is broken.
>>>>
>>>> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>>> Cc: linux-pm@vger.kernel.org
>>>> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
>>>> ---
>>>>  drivers/cpufreq/cpufreq_governor.c |   32 +++++++++++++++++++++++++++-----
>>>>  1 file changed, 27 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>> index b1ee597..f158882 100644
>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>> @@ -161,9 +161,18 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>>>  EXPORT_SYMBOL_GPL(dbs_check_cpu);
>>>>  
>>>>  static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>>>> -		unsigned int delay)
>>>> +				    unsigned int delay,
>>>> +				    struct cpufreq_policy *policy)
>>>>  {
>>>> -	struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>>> +	struct cpu_dbs_common_info *cdbs;
>>>> +
>>>> +	if (!dbs_data->cdata) {
>>>> +		pr_emerg("common_dbs_data is NULL for %s but initialized = %d",
>>>> +			 policy->governor->name,
>>>> +			 atomic_read(&policy->governor->initialized));
>>>> +		BUG();
>>>
>>> Is it necessary to crash the kernel here?
>>
>> Yes.  dbs_data->cdata is referenced right below.
>>
>>>
>>>> +	}
>>>> +	cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>
>> and we'll NULL pointer panic right here without any of the debug info above :(
> 
> Can we possibly avoid the panic?
> 
> Adding BUG() instead of a NULL pointer deref is not much improvement.

(sorry  for the lowercase typing.  i fractured my elbow and have resorted to
single hand typing ....)

rafael, i understand your concern and my description is clearly lacking for this
patch.  this patch is not meant to be a fix but is meant to capture debug info
for future issues in this code.  i thought about only doing the pr_emerg() but
that results in situations where other threads may continue processing and i
lose state in crashdump :(.   bug() is a good idea here imo.

P.
> 
>>
>>>>  
>>>>  	mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>>>>  }
>>>> @@ -185,10 +194,11 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>>>  		 * those works are canceled during CPU_DOWN_PREPARE so they
>>>>  		 * can't possibly run on any other CPU.
>>>>  		 */
>>>> -		__gov_queue_work(raw_smp_processor_id(), dbs_data, delay);
>>>> +		__gov_queue_work(raw_smp_processor_id(), dbs_data, delay,
>>>> +				 policy);
>>>>  	} else {
>>>>  		for_each_cpu(i, policy->cpus)
>>>> -			__gov_queue_work(i, dbs_data, delay);
>>>> +			__gov_queue_work(i, dbs_data, delay, policy);
>>>>  	}
>>>>  
>>>>  out_unlock:
>>>> @@ -258,7 +268,13 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>>  	else
>>>>  		dbs_data = cdata->gdbs_data;
>>>>  
>>>> -	WARN_ON(!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT));
>>>> +	if (!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT)) {
>>>> +		pr_emerg("governor_data is NULL but governor %s is initialized = %d [governor_enabled = %d event = %u]\n",
>>>> +			 policy->governor->name,
>>>> +			 atomic_read(&policy->governor->initialized),
>>>> +			 policy->governor_enabled, event);
>>>> +		BUG();
>>>
>>> And here?
>>>
>>
>> Ditto -- dbs_data is dereferenced in the call path and will NULL pointer panic.
>>
>> P.
>>
>>>> +	}
>>>>  
>>>>  	switch (event) {
>>>>  	case CPUFREQ_GOV_POLICY_INIT:
>>>> @@ -329,6 +345,12 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>>  	case CPUFREQ_GOV_POLICY_EXIT:
>>>>  		mutex_lock(&dbs_data->usage_count_mutex);
>>>>  		if (atomic_dec_and_test(&dbs_data->usage_count)) {
>>>> +			if (atomic_read(&policy->governor->initialized) > 1) {
>>>> +				pr_emerg("Removing governor %s but initialized = %d, dbs_data->usage_count = 0\n",
>>>> +					 policy->governor->name,
>>>> +				   atomic_read(&policy->governor->initialized));
>>>> +				BUG();
>>>> +			}
>>>>  			sysfs_remove_group(get_governor_parent_kobj(policy),
>>>>  					get_sysfs_attr(dbs_data));
>>>>  
>>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

  reply	other threads:[~2014-11-09 14:13 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-05 14:53 [PATCH 0/5] cpufreq, fix locking and data issues Prarit Bhargava
2014-11-05 14:53 ` [PATCH 1/5] cpufreq, do not return stale data to userspace Prarit Bhargava
2014-11-05 14:53 ` [PATCH 2/5] cpufreq, fix locking around CPUFREQ_GOV_POLICY_EXIT calls Prarit Bhargava
2014-11-10 10:44   ` Viresh Kumar
2014-11-10 12:26     ` Prarit Bhargava
2014-11-11  3:37       ` Viresh Kumar
2014-11-11 12:15         ` Prarit Bhargava
2014-11-11 13:07           ` Viresh Kumar
2014-11-13 21:58             ` Saravana Kannan
2014-11-05 14:53 ` [PATCH 3/5] cpufreq, dbs_data->usage count must be atomic Prarit Bhargava
2014-11-08  1:57   ` Rafael J. Wysocki
2014-11-11  3:40   ` Viresh Kumar
2014-11-05 14:53 ` [PATCH 4/5] cpufreq, policy->initialized " Prarit Bhargava
2014-11-08  1:59   ` Rafael J. Wysocki
2014-11-11  3:55   ` Viresh Kumar
2014-11-05 14:53 ` [PATCH 5/5] cpufreq, add BUG() messages in critical paths to aid debugging failures Prarit Bhargava
2014-11-08  2:00   ` Rafael J. Wysocki
2014-11-08 13:33     ` Prarit Bhargava
2014-11-08 21:46       ` Rafael J. Wysocki
2014-11-09 14:12         ` Prarit Bhargava [this message]
2014-11-11  4:23   ` Viresh Kumar
2014-11-11 12:18     ` Prarit Bhargava
2014-11-11 13:11       ` Viresh Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=545F765A.3040007@redhat.com \
    --to=prarit@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=robert.schoene@tu-dresden.de \
    --cc=sboyd@codeaurora.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.