[PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing

linux-pm.vger.kernel.org archive mirror

* [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing
@ 2013-08-26 22:45 Stephen Boyd
  2013-08-27  6:31 ` Viresh Kumar
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Boyd @ 2013-08-26 22:45 UTC (permalink / raw)
  To: Rafael J . Wysocki, Viresh Kumar; +Cc: linux-kernel, cpufreq, linux-pm


When a CPU is hot removed we'll cancel all the delayed work items
via gov_cancel_work(). Normally this will just cancel a delayed
timer on each CPU that the policy is managing and the work won't
run, but if the work is already running the workqueue code will
wait for the work to finish before continuing to prevent the
work items from re-queuing themselves like they normally do. This
scheme will work most of the time, except for the case where the
work function determines that it should adjust the delay for all
other CPUs that the policy is managing. If this scenario occurs,
the canceling CPU will cancel its own work but queue up the other
CPUs' works to run. For example:

 CPU0                                        CPU1
 ----                                        ----
 cpu_down()
  ...
  __cpufreq_remove_dev()
   cpufreq_governor_dbs()
    case CPUFREQ_GOV_STOP:
     gov_cancel_work(dbs_data, policy);
      cpu0 work is canceled
       timer is canceled
       cpu1 work is canceled                    <work runs>
       <waits for cpu1>                         od_dbs_timer()
                                                 gov_queue_work(*, *, true);
                                                  cpu0 work queued
                                                  cpu1 work queued
                                                  cpu2 work queued
                                                  ...
       cpu1 work is canceled
       cpu2 work is canceled
       ...

At the end of the GOV_STOP case cpu0 still has a work queued to
run although the code is expecting all of the works to be
canceled. __cpufreq_remove_dev() will then proceed to
re-initialize all the other CPUs' works except for the CPU that is
going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
will trample over the queued work and debugobjects will spit out
a warning:

WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10
Modules linked in:
CPU: 0 PID: 1491 Comm: sh Tainted: G        W    3.10.0 #19
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c01904cc>] (warn_slowpath_common+0x4c/0x6c)
[<c01904cc>] (warn_slowpath_common+0x4c/0x6c) from [<c019056c>] (warn_slowpath_fmt+0x2c/0x3c)
[<c019056c>] (warn_slowpath_fmt+0x2c/0x3c) from [<c0388a7c>] (debug_print_object+0x94/0xbc)
[<c0388a7c>] (debug_print_object+0x94/0xbc) from [<c0388e34>] (__debug_object_init+0x2d0/0x340)
[<c0388e34>] (__debug_object_init+0x2d0/0x340) from [<c019e3b0>] (init_timer_key+0x14/0xb0)
[<c019e3b0>] (init_timer_key+0x14/0xb0) from [<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8)
[<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8) from [<c06325a0>] (__cpufreq_governor+0xdc/0x1a4)
[<c06325a0>] (__cpufreq_governor+0xdc/0x1a4) from [<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434)
[<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434) from [<c08989f4>] (cpufreq_cpu_callback+0x60/0x80)
[<c08989f4>] (cpufreq_cpu_callback+0x60/0x80) from [<c08a43c0>] (notifier_call_chain+0x38/0x68)
[<c08a43c0>] (notifier_call_chain+0x38/0x68) from [<c01938e0>] (__cpu_notify+0x28/0x40)
[<c01938e0>] (__cpu_notify+0x28/0x40) from [<c0892ad4>] (_cpu_down+0x7c/0x2c0)
[<c0892ad4>] (_cpu_down+0x7c/0x2c0) from [<c0892d3c>] (cpu_down+0x24/0x40)
[<c0892d3c>] (cpu_down+0x24/0x40) from [<c0893ea8>] (store_online+0x2c/0x74)
[<c0893ea8>] (store_online+0x2c/0x74) from [<c04519d8>] (dev_attr_store+0x18/0x24)
[<c04519d8>] (dev_attr_store+0x18/0x24) from [<c02a69d4>] (sysfs_write_file+0x100/0x148)
[<c02a69d4>] (sysfs_write_file+0x100/0x148) from [<c0255c18>] (vfs_write+0xcc/0x174)
[<c0255c18>] (vfs_write+0xcc/0x174) from [<c0255f70>] (SyS_write+0x38/0x64)
[<c0255f70>] (SyS_write+0x38/0x64) from [<c0106120>] (ret_fast_syscall+0x0/0x30)

The simplest fix is to check and see if the governor is being
stopped and ignore the all_cpus flag so that only the work that's
being canceled has the chance to re-queue itself.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---

This should probably go to stable. I think this all started happening 
in commit 031299b3be30f3ec (cpufreq: governors: Avoid unnecessary per cpu
timer interrupts, 2013-02-27).

 drivers/cpufreq/cpufreq_governor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 7b839a8..0375a3c 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -133,7 +133,7 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
 {
 	int i;
 
-	if (!all_cpus) {
+	if (!all_cpus || !policy->governor_enabled) {
 		__gov_queue_work(smp_processor_id(), dbs_data, delay);
 	} else {
 		for_each_cpu(i, policy->cpus)
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



* Re: [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing
  2013-08-26 22:45 [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing Stephen Boyd
@ 2013-08-27  6:31 ` Viresh Kumar
  2013-08-27 18:47   ` Stephen Boyd
  0 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2013-08-27  6:31 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Rafael J . Wysocki, Linux Kernel Mailing List,
	cpufreq@vger.kernel.org, linux-pm@vger.kernel.org

On 27 August 2013 04:15, Stephen Boyd <sboyd@codeaurora.org> wrote:
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -133,7 +133,7 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>  {
>         int i;
>
> -       if (!all_cpus) {
> +       if (!all_cpus || !policy->governor_enabled) {
>                 __gov_queue_work(smp_processor_id(), dbs_data, delay);
>         } else {
>                 for_each_cpu(i, policy->cpus)

Shouldn't we simply do this instead at the top of this function?

> +       if (!policy->governor_enabled)
> +              return;


* [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing
  2013-08-27  6:31 ` Viresh Kumar
@ 2013-08-27 18:47   ` Stephen Boyd
  2013-08-27 22:01     ` [PATCH] cpufreq: Don't use smp_processor_id() in preemptible context Stephen Boyd
  2013-08-28  5:37     ` [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing Viresh Kumar
  0 siblings, 2 replies; 10+ messages in thread
From: Stephen Boyd @ 2013-08-27 18:47 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linux-kernel, cpufreq, linux-pm, Rafael J . Wysocki

When a CPU is hot removed we'll cancel all the delayed work items
via gov_cancel_work(). Normally this will just cancel a delayed
timer on each CPU that the policy is managing and the work won't
run, but if the work is already running the workqueue code will
wait for the work to finish before continuing to prevent the
work items from re-queuing themselves like they normally do. This
scheme will work most of the time, except for the case where the
work function determines that it should adjust the delay for all
other CPUs that the policy is managing. If this scenario occurs,
the canceling CPU will cancel its own work but queue up the other
CPUs' works to run. For example:

 CPU0                                        CPU1
 ----                                        ----
 cpu_down()
  ...
  __cpufreq_remove_dev()
   cpufreq_governor_dbs()
    case CPUFREQ_GOV_STOP:
     gov_cancel_work(dbs_data, policy);
      cpu0 work is canceled
       timer is canceled
       cpu1 work is canceled                    <work runs>
       <waits for cpu1>                         od_dbs_timer()
                                                 gov_queue_work(*, *, true);
                                                  cpu0 work queued
                                                  cpu1 work queued
                                                  cpu2 work queued
                                                  ...
       cpu1 work is canceled
       cpu2 work is canceled
       ...

At the end of the GOV_STOP case cpu0 still has a work queued to
run although the code is expecting all of the works to be
canceled. __cpufreq_remove_dev() will then proceed to
re-initialize all the other CPUs' works except for the CPU that is
going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
will trample over the queued work and debugobjects will spit out
a warning:

WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10
Modules linked in:
CPU: 0 PID: 1491 Comm: sh Tainted: G        W    3.10.0 #19
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c01904cc>] (warn_slowpath_common+0x4c/0x6c)
[<c01904cc>] (warn_slowpath_common+0x4c/0x6c) from [<c019056c>] (warn_slowpath_fmt+0x2c/0x3c)
[<c019056c>] (warn_slowpath_fmt+0x2c/0x3c) from [<c0388a7c>] (debug_print_object+0x94/0xbc)
[<c0388a7c>] (debug_print_object+0x94/0xbc) from [<c0388e34>] (__debug_object_init+0x2d0/0x340)
[<c0388e34>] (__debug_object_init+0x2d0/0x340) from [<c019e3b0>] (init_timer_key+0x14/0xb0)
[<c019e3b0>] (init_timer_key+0x14/0xb0) from [<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8)
[<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8) from [<c06325a0>] (__cpufreq_governor+0xdc/0x1a4)
[<c06325a0>] (__cpufreq_governor+0xdc/0x1a4) from [<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434)
[<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434) from [<c08989f4>] (cpufreq_cpu_callback+0x60/0x80)
[<c08989f4>] (cpufreq_cpu_callback+0x60/0x80) from [<c08a43c0>] (notifier_call_chain+0x38/0x68)
[<c08a43c0>] (notifier_call_chain+0x38/0x68) from [<c01938e0>] (__cpu_notify+0x28/0x40)
[<c01938e0>] (__cpu_notify+0x28/0x40) from [<c0892ad4>] (_cpu_down+0x7c/0x2c0)
[<c0892ad4>] (_cpu_down+0x7c/0x2c0) from [<c0892d3c>] (cpu_down+0x24/0x40)
[<c0892d3c>] (cpu_down+0x24/0x40) from [<c0893ea8>] (store_online+0x2c/0x74)
[<c0893ea8>] (store_online+0x2c/0x74) from [<c04519d8>] (dev_attr_store+0x18/0x24)
[<c04519d8>] (dev_attr_store+0x18/0x24) from [<c02a69d4>] (sysfs_write_file+0x100/0x148)
[<c02a69d4>] (sysfs_write_file+0x100/0x148) from [<c0255c18>] (vfs_write+0xcc/0x174)
[<c0255c18>] (vfs_write+0xcc/0x174) from [<c0255f70>] (SyS_write+0x38/0x64)
[<c0255f70>] (SyS_write+0x38/0x64) from [<c0106120>] (ret_fast_syscall+0x0/0x30)

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---

On 08/27, Viresh Kumar wrote:
> On 27 August 2013 04:15, Stephen Boyd <sboyd@codeaurora.org> wrote:
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -133,7 +133,7 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> >  {
> >         int i;
> >
> > -       if (!all_cpus) {
> > +       if (!all_cpus || !policy->governor_enabled) {
> >                 __gov_queue_work(smp_processor_id(), dbs_data, delay);
> >         } else {
> >                 for_each_cpu(i, policy->cpus)
> 
> Shouldn't we simply do this instead at the top of this function?
> 
> > +       if (!policy->governor_enabled)
> > +              return;

Sure, that works just as well. Here's a patch.

 drivers/cpufreq/cpufreq_governor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 7b839a8..b9b20fd 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -133,6 +133,9 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
 {
 	int i;
 
+	if (!policy->governor_enabled)
+		return;
+
 	if (!all_cpus) {
 		__gov_queue_work(smp_processor_id(), dbs_data, delay);
 	} else {
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



* [PATCH] cpufreq: Don't use smp_processor_id() in preemptible context
  2013-08-27 18:47   ` Stephen Boyd
@ 2013-08-27 22:01     ` Stephen Boyd
  2013-08-28  6:34       ` Viresh Kumar
  2013-08-28  5:37     ` [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing Viresh Kumar
  1 sibling, 1 reply; 10+ messages in thread
From: Stephen Boyd @ 2013-08-27 22:01 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linux-kernel, cpufreq, linux-pm, Rafael J . Wysocki

Workqueues are preemptible even if works are queued on them with
queue_work_on(). Let's just use the policy->cpu argument here
instead of using smp_processor_id() to silence the warning.

BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
caller is gov_queue_work+0x28/0xb0
CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G        W    3.10.0 #30
Workqueue: events od_dbs_timer
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0)
[<c03885a4>] (debug_smp_processor_id+0xbc/0xf0) from [<c0635864>] (gov_queue_work+0x28/0xb0)
[<c0635864>] (gov_queue_work+0x28/0xb0) from [<c0635618>] (od_dbs_timer+0x108/0x134)
[<c0635618>] (od_dbs_timer+0x108/0x134) from [<c01aa8f8>] (process_one_work+0x25c/0x444)
[<c01aa8f8>] (process_one_work+0x25c/0x444) from [<c01aaf88>] (worker_thread+0x200/0x344)
[<c01aaf88>] (worker_thread+0x200/0x344) from [<c01b03bc>] (kthread+0xa0/0xb0)
[<c01b03bc>] (kthread+0xa0/0xb0) from [<c01061b8>] (ret_from_fork+0x14/0x3c)

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---

Also found this one. I'm tracking down a pretty bad hotplug/sysfs
race on 3.10. I've applied all the stable patches but I'm still seeing
the warning below. I'll start another thread on it.

WARNING: at kernel/mutex.c:341 __mutex_lock_slowpath+0x14c/0x410()
DEBUG_LOCKS_WARN_ON(l->magic != l)
Modules linked in:
CPU: 0 PID: 1960 Comm: sh Tainted: G        W    3.10.0 #32
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c01904cc>] (warn_slowpath_common+0x4c/0x6c)
[<c01904cc>] (warn_slowpath_common+0x4c/0x6c) from [<c019056c>] (warn_slowpath_fmt+0x2c/0x3c)
[<c019056c>] (warn_slowpath_fmt+0x2c/0x3c) from [<c08a0334>] (__mutex_lock_slowpath+0x14c/0x410)
[<c08a0334>] (__mutex_lock_slowpath+0x14c/0x410) from [<c08a0618>] (mutex_lock+0x20/0x3c)
[<c08a0618>] (mutex_lock+0x20/0x3c) from [<c0636114>] (cpufreq_governor_dbs+0x568/0x5f8)
[<c0636114>] (cpufreq_governor_dbs+0x568/0x5f8) from [<c06325b0>] (__cpufreq_governor+0xdc/0x1a4)
[<c06325b0>] (__cpufreq_governor+0xdc/0x1a4) from [<c06328f0>] (__cpufreq_set_policy+0x278/0x2c0)
[<c06328f0>] (__cpufreq_set_policy+0x278/0x2c0) from [<c0632ea0>] (store_scaling_min_freq+0x80/0x9c)
[<c0632ea0>] (store_scaling_min_freq+0x80/0x9c) from [<c0633ae4>] (store+0x58/0x90)
[<c0633ae4>] (store+0x58/0x90) from [<c02a69d4>] (sysfs_write_file+0x100/0x148)
[<c02a69d4>] (sysfs_write_file+0x100/0x148) from [<c0255c18>] (vfs_write+0xcc/0x174)
[<c0255c18>] (vfs_write+0xcc/0x174) from [<c0255f70>] (SyS_write+0x38/0x64)
[<c0255f70>] (SyS_write+0x38/0x64) from [<c0106120>] (ret_fast_syscall+0x0/0x30)

 drivers/cpufreq/cpufreq_governor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index b9b20fd..523af48 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -137,7 +137,7 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
 		return;
 
 	if (!all_cpus) {
-		__gov_queue_work(smp_processor_id(), dbs_data, delay);
+		__gov_queue_work(policy->cpu, dbs_data, delay);
 	} else {
 		for_each_cpu(i, policy->cpus)
 			__gov_queue_work(i, dbs_data, delay);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



* Re: [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing
  2013-08-27 18:47   ` Stephen Boyd
  2013-08-27 22:01     ` [PATCH] cpufreq: Don't use smp_processor_id() in preemptible context Stephen Boyd
@ 2013-08-28  5:37     ` Viresh Kumar
  1 sibling, 0 replies; 10+ messages in thread
From: Viresh Kumar @ 2013-08-28  5:37 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Linux Kernel Mailing List, cpufreq@vger.kernel.org,
	linux-pm@vger.kernel.org, Rafael J . Wysocki

On 28 August 2013 00:17, Stephen Boyd <sboyd@codeaurora.org> wrote:
> Sure that works just as well. Here's a patch.
>
>  drivers/cpufreq/cpufreq_governor.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 7b839a8..b9b20fd 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -133,6 +133,9 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>  {
>         int i;
>
> +       if (!policy->governor_enabled)
> +               return;
> +
>         if (!all_cpus) {
>                 __gov_queue_work(smp_processor_id(), dbs_data, delay);
>         } else {

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>


* Re: [PATCH] cpufreq: Don't use smp_processor_id() in preemptible context
  2013-08-27 22:01     ` [PATCH] cpufreq: Don't use smp_processor_id() in preemptible context Stephen Boyd
@ 2013-08-28  6:34       ` Viresh Kumar
  2013-08-28 16:26         ` Stephen Boyd
  0 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2013-08-28  6:34 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Linux Kernel Mailing List, cpufreq@vger.kernel.org,
	linux-pm@vger.kernel.org, Rafael J . Wysocki

On 28 August 2013 03:31, Stephen Boyd <sboyd@codeaurora.org> wrote:
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index b9b20fd..523af48 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -137,7 +137,7 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>                 return;
>
>         if (!all_cpus) {
> -               __gov_queue_work(smp_processor_id(), dbs_data, delay);
> +               __gov_queue_work(policy->cpu, dbs_data, delay);

This is probably wrong. We wanted to queue work on the current CPU,
not on policy->cpu. Can you use raw_smp_processor_id()?

>         } else {
>                 for_each_cpu(i, policy->cpus)
>                         __gov_queue_work(i, dbs_data, delay);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>


* Re: [PATCH] cpufreq: Don't use smp_processor_id() in preemptible context
  2013-08-28  6:34       ` Viresh Kumar
@ 2013-08-28 16:26         ` Stephen Boyd
  2013-08-28 21:24           ` [PATCH v2] " Stephen Boyd
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Boyd @ 2013-08-28 16:26 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Linux Kernel Mailing List, cpufreq@vger.kernel.org,
	linux-pm@vger.kernel.org, Rafael J . Wysocki

On 08/27/13 23:34, Viresh Kumar wrote:
> On 28 August 2013 03:31, Stephen Boyd <sboyd@codeaurora.org> wrote:
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index b9b20fd..523af48 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -137,7 +137,7 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>                 return;
>>
>>         if (!all_cpus) {
>> -               __gov_queue_work(smp_processor_id(), dbs_data, delay);
>> +               __gov_queue_work(policy->cpu, dbs_data, delay);
> This is probably wrong. We wanted to queue work on the current CPU,
> not on policy->cpu. Can you use raw_smp_processor_id()?

Ah right, for the case where the policy covers more than one cpu.
raw_smp_processor_id() would work but it probably also needs a large
comment. I'll resend with that.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



* [PATCH v2] cpufreq: Don't use smp_processor_id() in preemptible context
  2013-08-28 16:26         ` Stephen Boyd
@ 2013-08-28 21:24           ` Stephen Boyd
  2013-08-29  4:20             ` Viresh Kumar
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Boyd @ 2013-08-28 21:24 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linux-kernel, cpufreq, linux-pm, Rafael J . Wysocki

Workqueues are preemptible even if works are queued on them with
queue_work_on(). Let's use raw_smp_processor_id() here to silence
the warning.

BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
caller is gov_queue_work+0x28/0xb0
CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G        W    3.10.0 #30
Workqueue: events od_dbs_timer
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0)
[<c03885a4>] (debug_smp_processor_id+0xbc/0xf0) from [<c0635864>] (gov_queue_work+0x28/0xb0)
[<c0635864>] (gov_queue_work+0x28/0xb0) from [<c0635618>] (od_dbs_timer+0x108/0x134)
[<c0635618>] (od_dbs_timer+0x108/0x134) from [<c01aa8f8>] (process_one_work+0x25c/0x444)
[<c01aa8f8>] (process_one_work+0x25c/0x444) from [<c01aaf88>] (worker_thread+0x200/0x344)
[<c01aaf88>] (worker_thread+0x200/0x344) from [<c01b03bc>] (kthread+0xa0/0xb0)
[<c01b03bc>] (kthread+0xa0/0xb0) from [<c01061b8>] (ret_from_fork+0x14/0x3c)

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 drivers/cpufreq/cpufreq_governor.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index b9b20fd..bfbcf9a 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -137,7 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
 		return;
 
 	if (!all_cpus) {
-		__gov_queue_work(smp_processor_id(), dbs_data, delay);
+		/*
+		 * Use raw_smp_processor_id() to avoid preemptible warnings.
+		 * We know that this is only called with all_cpus == false from
+		 * works that have been queued with *_work_on() functions and
+		 * those works are canceled during CPU_DOWN_PREPARE so they
+		 * can't possibly run on any other CPU.
+		 */
+		__gov_queue_work(raw_smp_processor_id(), dbs_data, delay);
 	} else {
 		for_each_cpu(i, policy->cpus)
 			__gov_queue_work(i, dbs_data, delay);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



* Re: [PATCH v2] cpufreq: Don't use smp_processor_id() in preemptible context
  2013-08-28 21:24           ` [PATCH v2] " Stephen Boyd
@ 2013-08-29  4:20             ` Viresh Kumar
  2013-08-29 20:31               ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2013-08-29  4:20 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Linux Kernel Mailing List, cpufreq@vger.kernel.org,
	linux-pm@vger.kernel.org, Rafael J . Wysocki

On 29 August 2013 02:54, Stephen Boyd <sboyd@codeaurora.org> wrote:
> Workqueues are preemptible even if works are queued on them with
> queue_work_on(). Let's use raw_smp_processor_id() here to silence
> the warning.
>
> BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
> caller is gov_queue_work+0x28/0xb0
> CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G        W    3.10.0 #30
> Workqueue: events od_dbs_timer
> [<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
> [<c0109dec>] (show_stack+0x10/0x14) from [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0)
> [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0) from [<c0635864>] (gov_queue_work+0x28/0xb0)
> [<c0635864>] (gov_queue_work+0x28/0xb0) from [<c0635618>] (od_dbs_timer+0x108/0x134)
> [<c0635618>] (od_dbs_timer+0x108/0x134) from [<c01aa8f8>] (process_one_work+0x25c/0x444)
> [<c01aa8f8>] (process_one_work+0x25c/0x444) from [<c01aaf88>] (worker_thread+0x200/0x344)
> [<c01aaf88>] (worker_thread+0x200/0x344) from [<c01b03bc>] (kthread+0xa0/0xb0)
> [<c01b03bc>] (kthread+0xa0/0xb0) from [<c01061b8>] (ret_from_fork+0x14/0x3c)
>
> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> ---
>  drivers/cpufreq/cpufreq_governor.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>


* Re: [PATCH v2] cpufreq: Don't use smp_processor_id() in preemptible context
  2013-08-29  4:20             ` Viresh Kumar
@ 2013-08-29 20:31               ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2013-08-29 20:31 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Stephen Boyd, Linux Kernel Mailing List, cpufreq@vger.kernel.org,
	linux-pm@vger.kernel.org

On Thursday, August 29, 2013 09:50:22 AM Viresh Kumar wrote:
> On 29 August 2013 02:54, Stephen Boyd <sboyd@codeaurora.org> wrote:
> > Workqueues are preemptible even if works are queued on them with
> > queue_work_on(). Let's use raw_smp_processor_id() here to silence
> > the warning.
> >
> > BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
> > caller is gov_queue_work+0x28/0xb0
> > CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G        W    3.10.0 #30
> > Workqueue: events od_dbs_timer
> > [<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
> > [<c0109dec>] (show_stack+0x10/0x14) from [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0)
> > [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0) from [<c0635864>] (gov_queue_work+0x28/0xb0)
> > [<c0635864>] (gov_queue_work+0x28/0xb0) from [<c0635618>] (od_dbs_timer+0x108/0x134)
> > [<c0635618>] (od_dbs_timer+0x108/0x134) from [<c01aa8f8>] (process_one_work+0x25c/0x444)
> > [<c01aa8f8>] (process_one_work+0x25c/0x444) from [<c01aaf88>] (worker_thread+0x200/0x344)
> > [<c01aaf88>] (worker_thread+0x200/0x344) from [<c01b03bc>] (kthread+0xa0/0xb0)
> > [<c01b03bc>] (kthread+0xa0/0xb0) from [<c01061b8>] (ret_from_fork+0x14/0x3c)
> >
> > Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> > ---
> >  drivers/cpufreq/cpufreq_governor.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Queued up for 3.12, thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Thread overview: 10+ messages
2013-08-26 22:45 [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing Stephen Boyd
2013-08-27  6:31 ` Viresh Kumar
2013-08-27 18:47   ` Stephen Boyd
2013-08-27 22:01     ` [PATCH] cpufreq: Don't use smp_processor_id() in preemptible context Stephen Boyd
2013-08-28  6:34       ` Viresh Kumar
2013-08-28 16:26         ` Stephen Boyd
2013-08-28 21:24           ` [PATCH v2] " Stephen Boyd
2013-08-29  4:20             ` Viresh Kumar
2013-08-29 20:31               ` Rafael J. Wysocki
2013-08-28  5:37     ` [PATCH] cpufreq: Fix timer/workqueue corruption due to double queueing Viresh Kumar
