public inbox for cpufreq@vger.kernel.org
From: "Bjørn Mork" <bjorn@mork.no>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: cpufreq@vger.kernel.org, Viresh Kumar <viresh.kumar@linaro.org>,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] cpufreq: try to resume policies which failed on last resume
Date: Fri, 03 Jan 2014 10:24:45 +0100	[thread overview]
Message-ID: <878uuxquxu.fsf@nemi.mork.no> (raw)
In-Reply-To: <2302938.b8tymqrMEz@vostro.rjw.lan> (Rafael J. Wysocki's message of "Fri, 03 Jan 2014 01:40:42 +0100")

"Rafael J. Wysocki" <rjw@rjwysocki.net> writes:

> OK, thanks!  Well, this is somewhat worrisome.
>
> Could you also check the linux-pm.git/fixes branch that contains all patches
> I'm planning to push for 3.13-rc7 shortly?

It's definitely still there.  But I'm less sure about the exact trigger.

I have now booted and suspended that branch a few times, trying to
figure out which actions it depends on.  It seems to depend on a
modified scaling_max_freq (or, I guess, any other attribute - I haven't
tested anything else) in combination with suspend.  But the order doesn't
necessarily matter.  I can sometimes set off the warning by writing to
sysfs after resuming, and sometimes at suspend time if the value is
already modified.  The results are not consistent though.  Sometimes
there is no warning at all.

But what's 100% sure is that I still can trigger this with the
linux-pm.git/fixes branch.
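
Roughly, the sequence I have been using is the following (a sketch only -
the frequency value and CPU number are just what I happened to use, and
either write may set off the warning):

```shell
#!/bin/sh
# Reproduction sketch: modify a cpufreq policy attribute, hibernate,
# then modify it again after resume.  Needs root and cpufreq sysfs;
# bail out harmlessly where those are unavailable.
F=/sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq
[ -w "$F" ] || { echo "need root and cpufreq sysfs"; exit 0; }
echo 800000 > "$F"      # step 1: modified attribute
s2disk                  # step 2: hibernate (non-boot CPUs go down and up)
echo 800000 > "$F"      # step 3: write again after resume
dmesg | grep "circular locking"
```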


For example, triggered by

  echo 800000 >/sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq

after resume (I believe the taint is due to a request_firmware failure
for one of the bluetooth devices):

 
 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.13.0-rc6+ #183 Tainted: G        W   
 -------------------------------------------------------
 bash/4897 is trying to acquire lock:
  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81039f4a>] get_online_cpus+0x3c/0x50
 
 but task is already holding lock:
  (s_active#273){++++.+}, at: [<ffffffff8118a93f>] sysfs_write_file+0xd6/0x18b
 
 which lock already depends on the new lock.
 
 
 the existing dependency chain (in reverse order) is:
 
 -> #1 (s_active#273){++++.+}:
        [<ffffffff81075027>] lock_acquire+0xfb/0x144
        [<ffffffff8118b027>] sysfs_deactivate+0xa5/0x108
        [<ffffffff8118b9d7>] sysfs_addrm_finish+0x28/0x46
        [<ffffffff8118bd3f>] sysfs_remove+0x2a/0x31
        [<ffffffff8118be2f>] sysfs_remove_dir+0x66/0x6b
        [<ffffffff811d5d11>] kobject_del+0x18/0x42
        [<ffffffff811d5e1c>] kobject_cleanup+0xe1/0x14f
        [<ffffffff811d5ede>] kobject_put+0x45/0x49
        [<ffffffff812a6ed5>] cpufreq_policy_put_kobj+0x37/0x83
        [<ffffffff812a8bfb>] __cpufreq_add_dev.isra.18+0x75e/0x78c
        [<ffffffff812a8c89>] cpufreq_cpu_callback+0x53/0x88
        [<ffffffff813a314c>] notifier_call_chain+0x67/0x92
        [<ffffffff8105bce4>] __raw_notifier_call_chain+0x9/0xb
        [<ffffffff81039e7c>] __cpu_notify+0x1b/0x32
        [<ffffffff81039ea1>] cpu_notify+0xe/0x10
        [<ffffffff8103a12b>] _cpu_up+0xf1/0x124
        [<ffffffff8138ee7d>] enable_nonboot_cpus+0x52/0xbf
        [<ffffffff8107a2a3>] hibernation_snapshot+0x1be/0x2ed
        [<ffffffff8107eb84>] snapshot_ioctl+0x1e5/0x431
        [<ffffffff81139080>] do_vfs_ioctl+0x45e/0x4a8
        [<ffffffff8113911c>] SyS_ioctl+0x52/0x82
        [<ffffffff813a57a2>] system_call_fastpath+0x16/0x1b
 
 -> #0 (cpu_hotplug.lock){+.+.+.}:
        [<ffffffff81074760>] __lock_acquire+0xae3/0xe68
        [<ffffffff81075027>] lock_acquire+0xfb/0x144
        [<ffffffff8139d4d2>] mutex_lock_nested+0x6c/0x397
        [<ffffffff81039f4a>] get_online_cpus+0x3c/0x50
        [<ffffffff812a69c4>] store+0x20/0xad
        [<ffffffff8118a9a1>] sysfs_write_file+0x138/0x18b
        [<ffffffff8112a428>] vfs_write+0x9c/0x102
        [<ffffffff8112a716>] SyS_write+0x50/0x85
        [<ffffffff813a57a2>] system_call_fastpath+0x16/0x1b
 
 other info that might help us debug this:
 
  Possible unsafe locking scenario:
 
        CPU0                    CPU1
        ----                    ----
   lock(s_active#273);
                                lock(cpu_hotplug.lock);
                                lock(s_active#273);
   lock(cpu_hotplug.lock);
 
  *** DEADLOCK ***
 
 3 locks held by bash/4897:
  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811297c7>] file_start_write+0x27/0x29
  #1:  (&of->mutex){+.+.+.}, at: [<ffffffff8118a937>] sysfs_write_file+0xce/0x18b
  #2:  (s_active#273){++++.+}, at: [<ffffffff8118a93f>] sysfs_write_file+0xd6/0x18b
 
 stack backtrace:
 CPU: 1 PID: 4897 Comm: bash Tainted: G        W    3.13.0-rc6+ #183
 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
  ffffffff81ca83b0 ffff88021fa65c08 ffffffff81399cac 0000000000000006
  ffffffff81ca83b0 ffff88021fa65c58 ffffffff81397dc5 0000000000000002
  ffff8800b1af2810 0000000000000003 ffff8800b1af2ef0 0000000000000003
 Call Trace:
  [<ffffffff81399cac>] dump_stack+0x4e/0x68
  [<ffffffff81397dc5>] print_circular_bug+0x1f8/0x209
  [<ffffffff81074760>] __lock_acquire+0xae3/0xe68
  [<ffffffff81075027>] lock_acquire+0xfb/0x144
  [<ffffffff81039f4a>] ? get_online_cpus+0x3c/0x50
  [<ffffffff8139d4d2>] mutex_lock_nested+0x6c/0x397
  [<ffffffff81039f4a>] ? get_online_cpus+0x3c/0x50
  [<ffffffff81071b04>] ? __lock_is_held+0x32/0x54
  [<ffffffff81039f4a>] ? get_online_cpus+0x3c/0x50
  [<ffffffff81039f4a>] get_online_cpus+0x3c/0x50
  [<ffffffff812a69c4>] store+0x20/0xad
  [<ffffffff8118a9a1>] sysfs_write_file+0x138/0x18b
  [<ffffffff8112a428>] vfs_write+0x9c/0x102
  [<ffffffff8112a716>] SyS_write+0x50/0x85
  [<ffffffff813a57a2>] system_call_fastpath+0x16/0x1b


Or by s2disk (not the same boot - this time without the taint):

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.13.0-rc6+ #183 Not tainted
 -------------------------------------------------------
 s2disk/5123 is trying to acquire lock:
  (s_active#171){++++.+}, at: [<ffffffff8118b9d7>] sysfs_addrm_finish+0x28/0x46
 
 but task is already holding lock:
  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81039ff5>] cpu_hotplug_begin+0x28/0x50
 
 which lock already depends on the new lock.
 
 
 the existing dependency chain (in reverse order) is:
 
 -> #1 (cpu_hotplug.lock){+.+.+.}:
        [<ffffffff81075027>] lock_acquire+0xfb/0x144
        [<ffffffff8139d4d2>] mutex_lock_nested+0x6c/0x397
        [<ffffffff81039f4a>] get_online_cpus+0x3c/0x50
        [<ffffffff812a69c4>] store+0x20/0xad
        [<ffffffff8118a9a1>] sysfs_write_file+0x138/0x18b
        [<ffffffff8112a428>] vfs_write+0x9c/0x102
        [<ffffffff8112a716>] SyS_write+0x50/0x85
        [<ffffffff813a57a2>] system_call_fastpath+0x16/0x1b
 
 -> #0 (s_active#171){++++.+}:
        [<ffffffff81074760>] __lock_acquire+0xae3/0xe68
        [<ffffffff81075027>] lock_acquire+0xfb/0x144
        [<ffffffff8118b027>] sysfs_deactivate+0xa5/0x108
        [<ffffffff8118b9d7>] sysfs_addrm_finish+0x28/0x46
        [<ffffffff8118bd3f>] sysfs_remove+0x2a/0x31
        [<ffffffff8118be2f>] sysfs_remove_dir+0x66/0x6b
        [<ffffffff811d5d11>] kobject_del+0x18/0x42
        [<ffffffff811d5e1c>] kobject_cleanup+0xe1/0x14f
        [<ffffffff811d5ede>] kobject_put+0x45/0x49
        [<ffffffff812a6ed5>] cpufreq_policy_put_kobj+0x37/0x83
        [<ffffffff812a8bfb>] __cpufreq_add_dev.isra.18+0x75e/0x78c
        [<ffffffff812a8c89>] cpufreq_cpu_callback+0x53/0x88
        [<ffffffff813a314c>] notifier_call_chain+0x67/0x92
        [<ffffffff8105bce4>] __raw_notifier_call_chain+0x9/0xb
        [<ffffffff81039e7c>] __cpu_notify+0x1b/0x32
        [<ffffffff81039ea1>] cpu_notify+0xe/0x10
        [<ffffffff8103a12b>] _cpu_up+0xf1/0x124
        [<ffffffff8138ee7d>] enable_nonboot_cpus+0x52/0xbf
        [<ffffffff8107a2a3>] hibernation_snapshot+0x1be/0x2ed
        [<ffffffff8107eb84>] snapshot_ioctl+0x1e5/0x431
        [<ffffffff81139080>] do_vfs_ioctl+0x45e/0x4a8
        [<ffffffff8113911c>] SyS_ioctl+0x52/0x82
        [<ffffffff813a57a2>] system_call_fastpath+0x16/0x1b
 
 other info that might help us debug this:
 
  Possible unsafe locking scenario:
 
        CPU0                    CPU1
        ----                    ----
   lock(cpu_hotplug.lock);
                                lock(s_active#171);
                                lock(cpu_hotplug.lock);
   lock(s_active#171);
 
  *** DEADLOCK ***
 
 7 locks held by s2disk/5123:
  #0:  (pm_mutex){+.+.+.}, at: [<ffffffff8107e9ea>] snapshot_ioctl+0x4b/0x431
  #1:  (device_hotplug_lock){+.+.+.}, at: [<ffffffff812837ac>] lock_device_hotplug+0x12/0x14
  #2:  (acpi_scan_lock){+.+.+.}, at: [<ffffffff8122c6d3>] acpi_scan_lock_acquire+0x12/0x14
  #3:  (console_lock){+.+.+.}, at: [<ffffffff810817f2>] suspend_console+0x20/0x38
  #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81039fb9>] cpu_maps_update_begin+0x12/0x14
  #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81039ff5>] cpu_hotplug_begin+0x28/0x50
  #6:  (cpufreq_rwsem){.+.+.+}, at: [<ffffffff812a851c>] __cpufreq_add_dev.isra.18+0x7f/0x78c
 
 stack backtrace:
 CPU: 0 PID: 5123 Comm: s2disk Not tainted 3.13.0-rc6+ #183
 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
  ffffffff81d3de70 ffff8800b10718f8 ffffffff81399cac 0000000000003a93
  ffffffff81d3de70 ffff8800b1071948 ffffffff81397dc5 ffff8800b1071928
  ffff8800b10c8bd0 0000000000000006 ffff8800b10c9358 0000000000000006
 Call Trace:
  [<ffffffff81399cac>] dump_stack+0x4e/0x68
  [<ffffffff81397dc5>] print_circular_bug+0x1f8/0x209
  [<ffffffff81074760>] __lock_acquire+0xae3/0xe68
  [<ffffffff8107565d>] ? debug_check_no_locks_freed+0x12c/0x143
  [<ffffffff81075027>] lock_acquire+0xfb/0x144
  [<ffffffff8118b9d7>] ? sysfs_addrm_finish+0x28/0x46
  [<ffffffff81072201>] ? lockdep_init_map+0x14e/0x160
  [<ffffffff8118b027>] sysfs_deactivate+0xa5/0x108
  [<ffffffff8118b9d7>] ? sysfs_addrm_finish+0x28/0x46
  [<ffffffff8118b9d7>] sysfs_addrm_finish+0x28/0x46
  [<ffffffff8118bd3f>] sysfs_remove+0x2a/0x31
  [<ffffffff8118be2f>] sysfs_remove_dir+0x66/0x6b
  [<ffffffff811d5d11>] kobject_del+0x18/0x42
  [<ffffffff811d5e1c>] kobject_cleanup+0xe1/0x14f
  [<ffffffff811d5ede>] kobject_put+0x45/0x49
  [<ffffffff812a6ed5>] cpufreq_policy_put_kobj+0x37/0x83
  [<ffffffff812a8bfb>] __cpufreq_add_dev.isra.18+0x75e/0x78c
  [<ffffffff81071b04>] ? __lock_is_held+0x32/0x54
  [<ffffffff812a8c89>] cpufreq_cpu_callback+0x53/0x88
  [<ffffffff813a314c>] notifier_call_chain+0x67/0x92
  [<ffffffff8105bce4>] __raw_notifier_call_chain+0x9/0xb
  [<ffffffff81039e7c>] __cpu_notify+0x1b/0x32
  [<ffffffff81039ea1>] cpu_notify+0xe/0x10
  [<ffffffff8103a12b>] _cpu_up+0xf1/0x124
  [<ffffffff8138ee7d>] enable_nonboot_cpus+0x52/0xbf
  [<ffffffff8107a2a3>] hibernation_snapshot+0x1be/0x2ed
  [<ffffffff8107eb84>] snapshot_ioctl+0x1e5/0x431
  [<ffffffff81139080>] do_vfs_ioctl+0x45e/0x4a8
  [<ffffffff811414c8>] ? fcheck_files+0xa1/0xe4
  [<ffffffff81141a67>] ? fget_light+0x33/0x9a
  [<ffffffff8113911c>] SyS_ioctl+0x52/0x82
  [<ffffffff811df4ce>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff813a57a2>] system_call_fastpath+0x16/0x1b
 CPU1 is up
 ACPI: Waking up from system sleep state S4
 PM: noirq thaw of devices complete after 1.722 msecs
 PM: early thaw of devices complete after 1.180 msecs
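
Both traces show the same inversion, just entered from opposite sides:
the sysfs store path takes the s_active lock and then cpu_hotplug.lock
via get_online_cpus(), while the resume path holds cpu_hotplug.lock and
then takes s_active when cpufreq_policy_put_kobj() tears down the policy
kobject.  A minimal sketch of the bookkeeping behind such a report -
record an "outer -> inner" edge whenever a lock is taken while another
is held, then flag opposite-order pairs (purely illustrative, not kernel
code; only the two lock names are taken from the traces):

```python
held = []        # locks currently held by the simulated task
edges = set()    # observed (outer, inner) acquisition pairs

def acquire(lock):
    # Record a dependency edge from every held lock to the new one.
    for outer in held:
        edges.add((outer, lock))
    held.append(lock)

def release(lock):
    held.remove(lock)

def inversions():
    # Pairs acquired in both orders: the circular dependency lockdep reports.
    return {(a, b) for (a, b) in edges if (b, a) in edges}

# Path 1: sysfs write -> store() -> get_online_cpus()
acquire("s_active"); acquire("cpu_hotplug.lock")
release("cpu_hotplug.lock"); release("s_active")

# Path 2: resume -> __cpufreq_add_dev() -> kobject_put() -> sysfs_deactivate()
acquire("cpu_hotplug.lock"); acquire("s_active")
release("s_active"); release("cpu_hotplug.lock")

print(sorted(inversions()))
# -> [('cpu_hotplug.lock', 's_active'), ('s_active', 'cpu_hotplug.lock')]
```

Neither path deadlocks on its own; the warning fires because the two
orders can interleave, exactly as in the "Possible unsafe locking
scenario" tables above.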



Bjørn


Thread overview: 21+ messages
2013-12-24  1:41 [PATCH 1/2] cpufreq: try to resume policies which failed on last resume Viresh Kumar
2013-12-24  1:41 ` [PATCH 2/2] cpufreq: preserve user_policy across suspend/resume Viresh Kumar
2013-12-26  1:05 ` [PATCH 1/2] cpufreq: try to resume policies which failed on last resume Rafael J. Wysocki
2013-12-26  2:47   ` Viresh Kumar
2013-12-27  9:57     ` Viresh Kumar
2013-12-27  9:58       ` Viresh Kumar
2013-12-30 16:40         ` Bjørn Mork
2014-01-02 12:15   ` Bjørn Mork
2014-01-03  0:40     ` Rafael J. Wysocki
2014-01-03  9:24       ` Bjørn Mork [this message]
2014-01-03  9:53         ` Bjørn Mork
2014-01-03 11:19           ` Viresh Kumar
2014-01-03 11:55             ` Bjørn Mork
2014-01-06  6:27               ` Viresh Kumar
2014-01-06  9:01                 ` Bjørn Mork
2014-01-06  9:57                   ` Viresh Kumar
2014-01-06 10:49                     ` Bjørn Mork
2014-01-06 10:54                       ` Viresh Kumar
2014-01-06 11:33                         ` Rafael J. Wysocki
     [not found]                         ` <8738l15pht.fsf@nemi.mork.no>
2014-01-08  5:51                           ` Viresh Kumar
2014-01-06 11:14                       ` Rafael J. Wysocki
