From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>,
tigran@aivazian.fsnet.co.uk, tglx@linutronix.de, mingo@elte.hu,
hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org,
Linux PM mailing list <linux-pm@lists.linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BUGFIX][PATCH] Freezer, CPU hotplug, x86 Microcode: Fix task freezing failures
Date: Sun, 2 Oct 2011 21:36:41 +0200
Message-ID: <201110022136.41726.rjw@sisk.pl>
In-Reply-To: <4E88B5E0.6080503@linux.vnet.ibm.com>
Hi,
On Sunday, October 02, 2011, Srivatsa S. Bhat wrote:
> This patch addresses the warnings found in the logs in the
> task freezing failure bug reported in https://lkml.org/lkml/2011/9/5/28
>
> The warnings appear because of the reason explained below:
>
> There are microcode callbacks registered for CPU hotplug events such
> as a CPU getting offlined or onlined. When a CPU is offlined
> with tasks being frozen (as in the case of disabling the non-boot CPUs
> while preparing for a system suspend operation), the CPU_DEAD_FROZEN
> notification is sent, for which the microcode callback does not
> do anything. In particular, it does not free or invalidate the CPU
> microcode which it had got from userspace earlier. Hence when that CPU
> comes back online with tasks being frozen (as in the case of re-enabling
> the non-boot CPUs during a resume operation after suspend), the microcode
> callback applies the microcode (which it already possesses) to that CPU.
>
> However, during a pure CPU hotplug operation, tasks are not frozen and
> hence the CPU_DEAD notification is sent. Upon this event notification,
> the microcode callback frees the copy of microcode it has and
> invalidates it. And during a CPU online, it tries to apply the microcode
> to the CPU, but since it doesn't have the copy of the microcode, it depends
> on a userspace utility to get the microcode. This is perfectly fine when
> doing plain CPU hotplug operations alone.
>
> Things go wrong when a CPU hotplug stress test is carried out along with
> a suspend/resume operation running simultaneously. Upon getting a CPU_DEAD
> notification (for example, when a CPU offline occurs with tasks not frozen),
> the microcode callback frees up the microcode and invalidates it. Later
> when that CPU gets onlined with tasks being frozen, the microcode callback
> (for the CPU_ONLINE_FROZEN event) tries to apply the microcode to the CPU,
> doesn't find it, and hence depends on the (currently frozen) userspace to
> get the microcode again. This leads to the numerous "WARNING"s at
> drivers/base/firmware_class.c, which eventually lead to task freezing
> failures in the suspend code path, as has been reported.
>
> So, this patch addresses this issue by ensuring that microcode is not freed
> from kernel memory, nor invalidated when a CPU goes offline. Thus once the
> kernel gets the microcode during boot-up, it will never have to depend on
> userspace ever again to get microcode, since it never releases the copy it
> already has. So every run of the microcode callback for a CPU online event
> will now succeed irrespective of whether userspace is frozen or not. As a
> result, this fixes the task freezing failure encountered while running a CPU
> hotplug stress test concurrently with suspend/resume operations.
>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
Thanks for the fix. I'd like to push it for 3.2 and possibly -stable.
Does anyone have any objections?
Rafael
> arch/x86/kernel/microcode_core.c | 10 +++++++++-
> 1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/microcode_core.c b/arch/x86/kernel/microcode_core.c
> index f924280..cd7ef2f 100644
> --- a/arch/x86/kernel/microcode_core.c
> +++ b/arch/x86/kernel/microcode_core.c
> @@ -483,7 +483,15 @@ mc_cpu_callback(struct notifier_block *nb, unsigned long action, void *hcpu)
>  		sysfs_remove_group(&sys_dev->kobj, &mc_attr_group);
>  		pr_debug("CPU%d removed\n", cpu);
>  		break;
> -	case CPU_DEAD:
> +
> +	/*
> +	 * Do not invalidate the microcode if a CPU goes offline,
> +	 * because it would be impossible to get the microcode again
> +	 * from userspace when the CPU comes back up, if the userspace
> +	 * happens to be frozen at that moment by the freezer subsystem,
> +	 * for example, due to a suspend operation in progress.
> +	 */
> +
>  	case CPU_UP_CANCELED_FROZEN:
>  		/* The CPU refused to come up during a system resume */
>  		microcode_fini_cpu(cpu);
>
>
>
>
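The failure sequence described in the quoted changelog can be sketched as a
small userspace simulation. This is only an illustrative model, not kernel
code: every identifier below (sim_state, sim_callback, the SIM_* events,
sim_mixed_scenario) is hypothetical, and the only behaviour modelled is
"does the kernel still hold a microcode copy for this CPU, and can userspace
answer a new request". The free_on_dead flag stands in for the behaviour
before the patch (true) and after it (false).

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NR_CPUS 4

/* Hypothetical stand-ins for the hotplug notifications discussed above. */
enum sim_event {
	SIM_CPU_DEAD,           /* CPU offlined, tasks not frozen */
	SIM_CPU_DEAD_FROZEN,    /* CPU offlined during suspend */
	SIM_CPU_ONLINE,         /* CPU onlined, tasks not frozen */
	SIM_CPU_ONLINE_FROZEN,  /* CPU onlined during resume */
};

struct sim_state {
	bool cached[SIM_NR_CPUS]; /* kernel holds a microcode copy for the CPU */
	bool free_on_dead;        /* true models the behaviour before the patch */
	int failed_requests;      /* requests made while userspace was frozen */
};

void sim_init(struct sim_state *s, bool free_on_dead)
{
	/* Assume every CPU received its microcode from userspace at boot. */
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		s->cached[cpu] = true;
	s->free_on_dead = free_on_dead;
	s->failed_requests = 0;
}

/* Returns true if the simulated callback could do its job. */
bool sim_callback(struct sim_state *s, int cpu, enum sim_event ev)
{
	assert(cpu >= 0 && cpu < SIM_NR_CPUS);

	switch (ev) {
	case SIM_CPU_DEAD:
		if (s->free_on_dead)
			s->cached[cpu] = false; /* copy freed and invalidated */
		return true;
	case SIM_CPU_DEAD_FROZEN:
		return true;                    /* nothing is done for this event */
	case SIM_CPU_ONLINE:
	case SIM_CPU_ONLINE_FROZEN:
		if (s->cached[cpu])
			return true;            /* apply the copy already held */
		if (ev == SIM_CPU_ONLINE_FROZEN) {
			s->failed_requests++;   /* userspace is frozen, no answer */
			return false;
		}
		s->cached[cpu] = true;          /* userspace supplies a new copy */
		return true;
	}
	return false;
}

/* Plain hotplug offline of CPU1 followed by an online during resume. */
int sim_mixed_scenario(bool free_on_dead)
{
	struct sim_state s;

	sim_init(&s, free_on_dead);
	sim_callback(&s, 1, SIM_CPU_DEAD);
	sim_callback(&s, 1, SIM_CPU_ONLINE_FROZEN);
	return s.failed_requests;
}
```

In this model, sim_mixed_scenario(true) records one request that frozen
userspace cannot answer, while sim_mixed_scenario(false) records none,
which matches the reasoning in the changelog for keeping the copy across
CPU_DEAD.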