* [PATCH v2 0/3] Freezer, CPU hotplug, x86 Microcode: Fix task freezing failures
@ 2011-10-10 12:31 Srivatsa S. Bhat
From: Srivatsa S. Bhat @ 2011-10-10 12:31 UTC
  To: rjw
  Cc: srivatsa.bhat, bp, pavel, len.brown, tj, mingo, a.p.zijlstra,
	akpm, suresh.b.siddha, lucas.demarchi, rusty, rdunlap, vatsa,
	ashok.raj, tigran, tglx, hpa, linux-pm, linux-kernel, linux-doc

When CPU hotplug is run along with suspend/hibernate tests using
the pm_test framework, even at the freezer level, we hit task freezing
failures. One such failure was reported here:
https://lkml.org/lkml/2011/9/5/28

An excerpt of the log:

  Freezing of tasks failed after 20.01 seconds (2 tasks refusing to
  freeze, wq_busy=0):
  invert_cpu_stat D 0000000000000000  5304 20435  17329 0x00000084
   ffff8801f367bab8 0000000000000046 ffff8801f367bfd8 00000000001d3a00
   ffff8801f367a010 00000000001d3a00 00000000001d3a00 00000000001d3a00
   ffff8801f367bfd8 00000000001d3a00 ffff880414cc6840 ffff8801f36783c0
  Call Trace:
   [<ffffffff81532de5>] schedule_timeout+0x235/0x320
   [<ffffffff81532a0b>] wait_for_common+0x11b/0x170
   [<ffffffff81532b3d>] wait_for_completion+0x1d/0x20
   [<ffffffff81364486>] _request_firmware+0x156/0x2c0
   [<ffffffff81364686>] request_firmware+0x16/0x20
   [<ffffffffa01f0da0>] request_microcode_fw+0x70/0xf0 [microcode]
   [<ffffffffa01f0390>] microcode_init_cpu+0xc0/0x100 [microcode]
   [<ffffffffa01f14b4>] mc_cpu_callback+0x7c/0x11f [microcode]
   [<ffffffff815393a4>] notifier_call_chain+0x94/0xd0
   [<ffffffff8109770e>] __raw_notifier_call_chain+0xe/0x10
   [<ffffffff8106d000>] __cpu_notify+0x20/0x40
   [<ffffffff8152cf5b>] _cpu_up+0xc7/0x10e
   [<ffffffff8152d07b>] cpu_up+0xd9/0xec
   [<ffffffff8151e599>] store_online+0x99/0xd0
   [<ffffffff81355eb0>] sysdev_store+0x20/0x30
   [<ffffffff811f3096>] sysfs_write_file+0xe6/0x170
   [<ffffffff8117ee50>] vfs_write+0xd0/0x1a0
   [<ffffffff8117f024>] sys_write+0x54/0xa0
   [<ffffffff8153df02>] system_call_fastpath+0x16/0x1b


The reason behind this failure is explained below:

The x86 microcode update driver registers callbacks for CPU hotplug
events such as a CPU being offlined or onlined. Things go wrong when a
CPU hotplug stress test runs concurrently with a suspend/resume
operation. Upon getting a CPU_DEAD notification (for example, when a CPU
is offlined while tasks are not frozen), the microcode callback frees the
microcode and invalidates it. Later, when that CPU is onlined while tasks
are frozen, the microcode callback (for the CPU_ONLINE_FROZEN event)
tries to apply the microcode to the CPU, doesn't find it, and hence
depends on the (currently frozen) userspace to supply the microcode
again. This leads to the numerous "WARNING"s at
drivers/base/firmware_class.c, which eventually lead to the task freezing
failures in the suspend code path that were reported.
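
To make the sequence above concrete, here is a small userspace model of
it. This is only an illustrative sketch, not the microcode driver's code:
the names mc_callback_model, struct cpu_model, the EV_* events and
MC_WOULD_BLOCK are all hypothetical, and the real driver's behaviour on a
normal online is simplified to "userspace reloads the image".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event names, modelled on the notifications named above. */
enum hp_event { EV_CPU_DEAD, EV_CPU_ONLINE, EV_CPU_ONLINE_FROZEN };

/* MC_WOULD_BLOCK: the callback would have to wait on frozen userspace. */
enum mc_result { MC_OK, MC_WOULD_BLOCK };

struct cpu_model {
    bool has_cached_microcode; /* true while an image is held for this CPU */
};

/* Toy stand-in for the microcode hotplug callback (an assumption). */
static enum mc_result mc_callback_model(struct cpu_model *cpu,
                                        enum hp_event ev)
{
    assert(cpu != NULL);

    switch (ev) {
    case EV_CPU_DEAD:
        /* Offline: the cached microcode is freed and invalidated. */
        cpu->has_cached_microcode = false;
        return MC_OK;
    case EV_CPU_ONLINE:
        /* Tasks are running, so userspace can supply the image again. */
        cpu->has_cached_microcode = true;
        return MC_OK;
    case EV_CPU_ONLINE_FROZEN:
        /* Tasks are frozen: only a cached image can be applied. */
        return cpu->has_cached_microcode ? MC_OK : MC_WOULD_BLOCK;
    }
    return MC_OK;
}
```

In this model, a CPU_DEAD delivered while tasks are running, followed by
a CPU_ONLINE_FROZEN for the same CPU, returns MC_WOULD_BLOCK. That is the
state in which the real callback ends up waiting on frozen userspace.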

So, this patch series addresses this issue by ensuring that CPU hotplug and
suspend/hibernate don't run in parallel, thereby fixing the task freezing
failures.
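
The idea of mutual exclusion can be sketched in userspace as follows.
This is a rough model under stated assumptions, not the code in this
series: the helper names (suspend_begin_model, suspend_end_model,
cpu_up_model) are hypothetical, and the sketch simply refuses a CPU
online request while a suspend is in progress, whereas the actual patches
may wait instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Set while either a suspend/hibernate or a CPU online is in progress. */
static atomic_flag pm_or_hotplug_busy = ATOMIC_FLAG_INIT;

/* Try to enter the exclusive section; true on success. */
static bool try_begin(void)
{
    return !atomic_flag_test_and_set(&pm_or_hotplug_busy);
}

/* Leave the exclusive section. */
static void end_op(void)
{
    atomic_flag_clear(&pm_or_hotplug_busy);
}

/* Hypothetical suspend entry/exit points. */
static bool suspend_begin_model(void) { return try_begin(); }
static void suspend_end_model(void)   { end_op(); }

/*
 * Hypothetical CPU online path: returns 0 on success, or -1 if a
 * suspend/hibernate currently holds the exclusive section.
 */
static int cpu_up_model(bool *cpu_online)
{
    assert(cpu_online != NULL);

    if (!try_begin())
        return -1;      /* suspend in progress: do not online the CPU */
    *cpu_online = true; /* the CPU is brought up with tasks running */
    end_op();
    return 0;
}
```

With this arrangement a CPU can never be onlined in the middle of a
suspend in the model, so the "online while frozen without cached
microcode" state is never reached.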

v2: Implemented mutual exclusion between CPU hotplug and suspend/hibernate.

Srivatsa S. Bhat (3):
      Introduce helper functions
      Mutually exclude cpu online and suspend/hibernate
      Update documentation

 Documentation/power/freezing-of-tasks.txt |   22 ++++++++++++++++++++++
 include/linux/suspend.h                   |   21 +++++++++++++++++++--
 kernel/cpu.c                              |   10 ++++++++++
 3 files changed, 51 insertions(+), 2 deletions(-)




Thread overview: 40+ messages (newest: ~2011-10-19 17:30 UTC)
2011-10-10 12:31 [PATCH v2 0/3] Freezer, CPU hotplug, x86 Microcode: Fix task freezing failures Srivatsa S. Bhat
2011-10-10 12:32 ` [PATCH v2 1/3] Introduce helper functions Srivatsa S. Bhat
2011-10-10 12:33 ` [PATCH v2 2/3] Mutually exclude cpu online and suspend/hibernate Srivatsa S. Bhat
2011-10-10 12:45   ` Srivatsa S. Bhat
2011-10-10 14:26     ` Peter Zijlstra
2011-10-10 15:16       ` Srivatsa S. Bhat
2011-10-11 20:32         ` Srivatsa S. Bhat
2011-10-11 21:56           ` Rafael J. Wysocki
2011-10-12  3:57             ` Srivatsa S. Bhat
2011-10-12 19:31               ` Rafael J. Wysocki
2011-10-12 21:25                 ` Srivatsa S. Bhat
2011-10-12 22:09                   ` Rafael J. Wysocki
2011-10-13 15:42                     ` Srivatsa S. Bhat
2011-10-13 16:06                       ` Tejun Heo
2011-10-13 17:01                         ` Borislav Petkov
2011-10-13 17:29                           ` Srivatsa S. Bhat
2011-10-19 17:29                             ` Srivatsa S. Bhat
2011-10-13 18:03                           ` Alan Stern
2011-10-13 19:07                             ` Rafael J. Wysocki
2011-10-13 19:08                         ` Rafael J. Wysocki
2011-10-10 15:25       ` Alan Stern
2011-10-10 17:00     ` Tejun Heo
2011-10-11  9:18       ` Peter Zijlstra
2011-10-11  9:37         ` Srivatsa S. Bhat
2011-10-10 12:33 ` [PATCH v2 3/3] Update documentation Srivatsa S. Bhat
2011-10-10 15:23 ` [PATCH v2 0/3] Freezer, CPU hotplug, x86 Microcode: Fix task freezing failures Alan Stern
2011-10-10 15:32   ` Srivatsa S. Bhat
2011-10-10 16:53     ` Borislav Petkov
2011-10-10 17:14       ` Pavel Machek
2011-10-10 17:30       ` Srivatsa S. Bhat
2011-10-10 17:53         ` Borislav Petkov
2011-10-10 18:08           ` tj
2011-10-10 18:34             ` Borislav Petkov
2011-10-10 18:45               ` Srivatsa S. Bhat
2011-10-10 18:53               ` tj
2011-10-10 19:00                 ` Srivatsa S. Bhat
2011-10-10 20:35                   ` Borislav Petkov
     [not found]                 ` <20111010202913.GA30798@aftab>
2011-10-10 21:13                   ` tj
2011-10-11  9:17       ` Peter Zijlstra
2011-10-10 16:57   ` Tejun Heo
