Re: [BUG] While changing the cpufreq governor, kernel hits a bug in workqueue.c

From: Johannes Weiner <hannes@saeurebad.de>
To: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, balbir@linux.vnet.ibm.com,
	ego@linux.vnet.ibm.com, svaidy@linux.vnet.ibm.com,
	davej@codemonkey.org.uk
Subject: Re: [BUG] While changing the cpufreq governor, kernel hits a bug in workqueue.c
Date: Tue, 01 Jul 2008 16:00:34 +0200
Message-ID: <87y74l4scd.fsf@skyscraper.fehenstaub.lan>
In-Reply-To: <48638906.4090308@linux.vnet.ibm.com> (Nageswara R. Sastry's message of "Thu, 26 Jun 2008 17:48:14 +0530")

Hi,

Nageswara R Sastry <rnsastry@linux.vnet.ibm.com> writes:

> Hi,
>
> Johannes Weiner wrote:
>
>> From: Johannes Weiner <hannes@saeurebad.de>
>> Subject: cpufreq: cancel self-rearming work synchronously
>>
>> The ondemand and conservative governor workers are self-rearming.
>> Cancel them synchronously to avoid nasty races.
>>
>> Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
>> Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
>> ---
>>
>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>> index 5d3a04b..78bac06 100644
>> --- a/drivers/cpufreq/cpufreq_conservative.c
>> +++ b/drivers/cpufreq/cpufreq_conservative.c
>> @@ -467,7 +467,7 @@ static inline void dbs_timer_init(void)
>>
>>  static inline void dbs_timer_exit(void)
>>  {
>> -	cancel_delayed_work(&dbs_work);
>> +	cancel_delayed_work_sync(&dbs_work);
>>  	return;
>>  }
>>
>> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
>> index d2af20d..1eb8c58 100644
>> --- a/drivers/cpufreq/cpufreq_ondemand.c
>> +++ b/drivers/cpufreq/cpufreq_ondemand.c
>> @@ -490,7 +490,7 @@ static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info)
>>  static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info)
>>  {
>>  	dbs_info->enable = 0;
>> -	cancel_delayed_work(&dbs_info->work);
>> +	cancel_delayed_work_sync(&dbs_info->work);
>>  }
>>
>>  static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>
> I applied only the above patch, compiled the kernel, and am seeing a
> circular locking issue at boot time. I am checking this first and
> will let you know the results after applying both patches.
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.25.7.cpufreq_patch #2
> -------------------------------------------------------
> S06cpuspeed/3493 is trying to acquire lock:
>  (&(&dbs_info->work)->work){--..}, at: [<c012f46c>] __cancel_work_timer+0x80/0x177
>
> but task is already holding lock:
>  (dbs_mutex){--..}, at: [<c041e7cb>] cpufreq_governor_dbs+0x25e/0x2ed
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (dbs_mutex){--..}:
>        [<c013aa76>] add_lock_to_list+0x61/0x83
>        [<c013cfa3>] __lock_acquire+0x953/0xb05
>        [<c041e5e1>] cpufreq_governor_dbs+0x74/0x2ed
>        [<c013d1b4>] lock_acquire+0x5f/0x79
>        [<c041e5e1>] cpufreq_governor_dbs+0x74/0x2ed
>        [<c04cdaa7>] mutex_lock_nested+0xce/0x222
>        [<c041e5e1>] cpufreq_governor_dbs+0x74/0x2ed
>        [<c041e5e1>] cpufreq_governor_dbs+0x74/0x2ed
>        [<c041e5e1>] cpufreq_governor_dbs+0x74/0x2ed
>        [<c041c87a>] __cpufreq_governor+0x73/0xa6
>        [<c041c9e8>] __cpufreq_set_policy+0x13b/0x19e
>        [<c041d6b5>] cpufreq_add_dev+0x3b4/0x4aa
>        [<c041d296>] handle_update+0x0/0x21
>        [<c02ee310>] sysdev_driver_register+0x48/0x9a
>        [<c041c75b>] cpufreq_register_driver+0x9b/0x147
>        [<c06b742c>] kernel_init+0x130/0x26f
>        [<c06b72fc>] kernel_init+0x0/0x26f
>        [<c06b72fc>] kernel_init+0x0/0x26f
>        [<c0105527>] kernel_thread_helper+0x7/0x10
>        [<ffffffff>] 0xffffffff
>
> -> #1 (&per_cpu(cpu_policy_rwsem, cpu)){----}:
>        [<c013cfa3>] __lock_acquire+0x953/0xb05
>        [<c041d194>] lock_policy_rwsem_write+0x30/0x56
>        [<c010a83b>] save_stack_trace+0x1a/0x35
>        [<c013d1b4>] lock_acquire+0x5f/0x79
>        [<c041d194>] lock_policy_rwsem_write+0x30/0x56
>        [<c04cdfd9>] down_write+0x2b/0x44
>        [<c041d194>] lock_policy_rwsem_write+0x30/0x56
>        [<c041d194>] lock_policy_rwsem_write+0x30/0x56
>        [<c041e35e>] do_dbs_timer+0x40/0x24f
>        [<c012ee7f>] run_workqueue+0x81/0x187
>        [<c012eeba>] run_workqueue+0xbc/0x187
>        [<c012ee7f>] run_workqueue+0x81/0x187
>        [<c041e31e>] do_dbs_timer+0x0/0x24f
>        [<c012f6fa>] worker_thread+0x0/0xbd
>        [<c012f7ad>] worker_thread+0xb3/0xbd
>        [<c0131acc>] autoremove_wake_function+0x0/0x2d
>        [<c0131a1b>] kthread+0x38/0x5d
>        [<c01319e3>] kthread+0x0/0x5d
>        [<c0105527>] kernel_thread_helper+0x7/0x10
>        [<ffffffff>] 0xffffffff
>
> -> #0 (&(&dbs_info->work)->work){--..}:
>        [<c013b6a2>] print_circular_bug_tail+0x2a/0x61
>        [<c013cec8>] __lock_acquire+0x878/0xb05
>        [<c013d1b4>] lock_acquire+0x5f/0x79
>        [<c012f46c>] __cancel_work_timer+0x80/0x177
>        [<c012f497>] __cancel_work_timer+0xab/0x177
>        [<c012f46c>] __cancel_work_timer+0x80/0x177
>        [<c013c0ee>] mark_held_locks+0x39/0x53
>        [<c04cdbe8>] mutex_lock_nested+0x20f/0x222
>        [<c013c277>] trace_hardirqs_on+0xe7/0x10e
>        [<c04cdbf3>] mutex_lock_nested+0x21a/0x222
>        [<c041e7cb>] cpufreq_governor_dbs+0x25e/0x2ed
>        [<c041e7dd>] cpufreq_governor_dbs+0x270/0x2ed
>        [<c041c87a>] __cpufreq_governor+0x73/0xa6
>        [<c041c9d6>] __cpufreq_set_policy+0x129/0x19e
>        [<c041ce0b>] store_scaling_governor+0x112/0x135
>        [<c041d296>] handle_update+0x0/0x21
>        [<c0410065>] atkbd_set_leds+0x9/0xcf
>        [<c041ccf9>] store_scaling_governor+0x0/0x135
>        [<c041d7e7>] store+0x3c/0x54
>        [<c01a09a0>] sysfs_write_file+0xa9/0xdd
>        [<c01a08f7>] sysfs_write_file+0x0/0xdd
>        [<c016e412>] vfs_write+0x83/0xf6
>        [<c016e958>] sys_write+0x3c/0x63
>        [<c0104816>] sysenter_past_esp+0x5f/0xa5
>        [<ffffffff>] 0xffffffff
>
> other info that might help us debug this:
>
> 3 locks held by S06cpuspeed/3493:
>  #0:  (&buffer->mutex){--..}, at: [<c01a091b>] sysfs_write_file+0x24/0xdd
>  #1:  (&per_cpu(cpu_policy_rwsem, cpu)){----}, at: [<c041d194>] lock_policy_rwsem_write+0x30/0x56
>  #2:  (dbs_mutex){--..}, at: [<c041e7cb>] cpufreq_governor_dbs+0x25e/0x2ed
>
> stack backtrace:
> Pid: 3493, comm: S06cpuspeed Not tainted 2.6.25.7.cpufreq_patch #2
>  [<c013b6cf>] print_circular_bug_tail+0x57/0x61
>  [<c013cec8>] __lock_acquire+0x878/0xb05
>  [<c013d1b4>] lock_acquire+0x5f/0x79
>  [<c012f46c>] __cancel_work_timer+0x80/0x177
>  [<c012f497>] __cancel_work_timer+0xab/0x177
>  [<c012f46c>] __cancel_work_timer+0x80/0x177
>  [<c013c0ee>] mark_held_locks+0x39/0x53
>  [<c04cdbe8>] mutex_lock_nested+0x20f/0x222
>  [<c013c277>] trace_hardirqs_on+0xe7/0x10e
>  [<c04cdbf3>] mutex_lock_nested+0x21a/0x222
>  [<c041e7cb>] cpufreq_governor_dbs+0x25e/0x2ed
>  [<c041e7dd>] cpufreq_governor_dbs+0x270/0x2ed
>  [<c041c87a>] __cpufreq_governor+0x73/0xa6
>  [<c041c9d6>] __cpufreq_set_policy+0x129/0x19e
>  [<c041ce0b>] store_scaling_governor+0x112/0x135
>  [<c041d296>] handle_update+0x0/0x21
>  [<c0410065>] atkbd_set_leds+0x9/0xcf
>  [<c041ccf9>] store_scaling_governor+0x0/0x135
>  [<c041d7e7>] store+0x3c/0x54
>  [<c01a09a0>] sysfs_write_file+0xa9/0xdd
>  [<c01a08f7>] sysfs_write_file+0x0/0xdd
>  [<c016e412>] vfs_write+0x83/0xf6
>  [<c016e958>] sys_write+0x3c/0x63
>  [<c0104816>] sysenter_past_esp+0x5f/0xa5
>  =======================

Okay, the problem is in cpufreq_conservative.c.  We call
cancel_delayed_work_sync() while holding the mutex, but the work itself
tries to grab that same mutex, so it deadlocks; lockdep caught that
correctly.

The hunk for _ondemand is correct, but the one for _conservative is
obviously wrong, sorry :/
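
To illustrate the locking pattern described above, here is a minimal
user-space sketch with pthreads. It is only an analogy, not kernel code
and not the eventual fix: struct demo_governor, demo_worker(),
demo_start() and demo_stop() are hypothetical names, the mutex stands in
for dbs_mutex, and pthread_join() stands in for the synchronous cancel.
The point it shows is the ordering: drop the lock the worker needs
before waiting for the worker to finish.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the governor state; not the kernel's types. */
struct demo_governor {
	pthread_mutex_t lock;	/* plays the role of dbs_mutex */
	pthread_t worker;	/* plays the role of the self-rearming work */
	bool enable;		/* cleared to stop the worker from rearming */
	unsigned long ticks;	/* number of completed worker passes */
};

/* Worker body: takes the lock on every pass, like the work function. */
static void *demo_worker(void *arg)
{
	struct demo_governor *g = arg;

	for (;;) {
		pthread_mutex_lock(&g->lock);
		bool keep_going = g->enable;
		if (keep_going)
			g->ticks++;
		pthread_mutex_unlock(&g->lock);

		if (!keep_going)
			return NULL;
		sched_yield();	/* "rearm": give up the CPU, then run again */
	}
}

/* Initialise the state and start the worker; returns 0 on success. */
static int demo_start(struct demo_governor *g)
{
	int rc = pthread_mutex_init(&g->lock, NULL);

	if (rc != 0)
		return rc;
	g->enable = true;
	g->ticks = 0;
	return pthread_create(&g->worker, NULL, demo_worker, g);
}

/*
 * Safe stop: clear the flag under the lock, drop the lock, and only then
 * wait for the worker. Waiting while still holding g->lock could block
 * forever, because the worker needs that lock to finish its pass.
 */
static int demo_stop(struct demo_governor *g)
{
	assert(g != NULL);

	pthread_mutex_lock(&g->lock);
	g->enable = false;
	pthread_mutex_unlock(&g->lock);

	/* Synchronous wait, the analogue of cancel_delayed_work_sync(). */
	int rc = pthread_join(g->worker, NULL);

	pthread_mutex_destroy(&g->lock);
	return rc;
}
```

Calling demo_start() followed by demo_stop() on the same struct should
return 0 from both. Moving the pthread_join() call above the
pthread_mutex_unlock() would recreate the hang described above whenever
the worker is blocked on the lock at that moment.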

I will whip something up and get back to you.  Thanks a lot for testing!

	Hannes
