From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1762226AbYBAX0r (ORCPT ); Fri, 1 Feb 2008 18:26:47 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1760897AbYBAX0f (ORCPT ); Fri, 1 Feb 2008 18:26:35 -0500
Received: from mx2.mail.elte.hu ([157.181.151.9]:49974 "EHLO mx2.mail.elte.hu"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1760779AbYBAX0e (ORCPT ); Fri, 1 Feb 2008 18:26:34 -0500
Date: Sat, 2 Feb 2008 00:25:50 +0100
From: Ingo Molnar
To: Dmitry Adamushko
Cc: "Rafael J. Wysocki" , Peter Zijlstra , Steven Rostedt ,
	Andrew Morton , Linus Torvalds , LKML
Subject: Re: [Regression] 2.6.24-git3: Major annoyance during suspend/hibernation on x86-64 (bisected)
Message-ID: <20080201232550.GA682@elte.hu>
References: <200801280226.22013.rjw@sisk.pl> <1201795128.32654.22.camel@lappy>
	<200801312154.33754.rjw@sisk.pl> <1201867497.32654.49.camel@lappy>
	<20080201171022.GC2159@elte.hu> <20080201224852.GA16700@elte.hu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.17 (2007-11-01)
X-ELTE-VirusStatus: clean
X-ELTE-SpamScore: -1.5
X-ELTE-SpamLevel:
X-ELTE-SpamCheck: no
X-ELTE-SpamVersion: ELTE 2.0
X-ELTE-SpamCheck-Details: score=-1.5 required=5.9 tests=BAYES_00 autolearn=no SpamAssassin version=3.2.3
	-1.5 BAYES_00 BODY: Bayesian spam probability is 0 to 1% [score: 0.0000]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Dmitry Adamushko wrote:

> yeah, I was already on a half-way to check it out.
>
> It does fix a problem for me.
>
> Don't forget to take along these 2 fixes from Peter's patch:
>
> - fix break usage in do_each_thread() { } while_each_thread().
> - fix the hotplug switch stmt, a fall-through case was broken.

Dmitry, i sent Peter's fix(es) below to Linus.
Do you concur that it fixes all the practical and theoretical problems
you could see with the code too?

	Ingo

--------------->
Subject: debug: softlockup looping fix
From: Peter Zijlstra

Rafael J. Wysocki reported weird, multi-seconds delays during
suspend/resume and bisected it back to:

    commit 82a1fcb90287052aabfa235e7ffc693ea003fe69
    Author: Ingo Molnar
    Date:   Fri Jan 25 21:08:02 2008 +0100

        softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks

fix it:

 - restore the old wakeup mechanism
 - fix break usage in do_each_thread() { } while_each_thread().
 - fix the hotplug switch stmt, a fall-through case was broken.

Bisected-by: Rafael J. Wysocki
Signed-off-by: Peter Zijlstra
Tested-by: Rafael J. Wysocki
Signed-off-by: Ingo Molnar
---
 kernel/softlockup.c |   30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

Index: linux/kernel/softlockup.c
===================================================================
--- linux.orig/kernel/softlockup.c
+++ linux/kernel/softlockup.c
@@ -101,6 +101,10 @@ void softlockup_tick(void)
 
 	now = get_timestamp(this_cpu);
 
+	/* Wake up the high-prio watchdog task every second: */
+	if (now > (touch_timestamp + 1))
+		wake_up_process(per_cpu(watchdog_task, this_cpu));
+
 	/* Warn about unreasonable delays: */
 	if (now <= (touch_timestamp + softlockup_thresh))
 		return;
@@ -191,11 +195,11 @@ static void check_hung_uninterruptible_t
 	read_lock(&tasklist_lock);
 	do_each_thread(g, t) {
 		if (!--max_count)
-			break;
+			goto unlock;
 		if (t->state & TASK_UNINTERRUPTIBLE)
 			check_hung_task(t, now);
 	} while_each_thread(g, t);
-
+ unlock:
 	read_unlock(&tasklist_lock);
 }
 
@@ -218,14 +222,19 @@ static int watchdog(void *__bind_cpu)
 	 * debug-printout triggers in softlockup_tick().
	 */
 	while (!kthread_should_stop()) {
+		set_current_state(TASK_INTERRUPTIBLE);
 		touch_softlockup_watchdog();
-		msleep_interruptible(10000);
+		schedule();
+
+		if (kthread_should_stop())
+			break;
 
 		if (this_cpu != check_cpu)
 			continue;
 
 		if (sysctl_hung_task_timeout_secs)
 			check_hung_uninterruptible_tasks(this_cpu);
+
 	}
 
 	return 0;
@@ -259,13 +268,6 @@ cpu_callback(struct notifier_block *nfb,
 		wake_up_process(per_cpu(watchdog_task, hotcpu));
 		break;
 #ifdef CONFIG_HOTPLUG_CPU
-	case CPU_UP_CANCELED:
-	case CPU_UP_CANCELED_FROZEN:
-		if (!per_cpu(watchdog_task, hotcpu))
-			break;
-		/* Unbind so it can run.  Fall thru. */
-		kthread_bind(per_cpu(watchdog_task, hotcpu),
-			     any_online_cpu(cpu_online_map));
 	case CPU_DOWN_PREPARE:
 	case CPU_DOWN_PREPARE_FROZEN:
 		if (hotcpu == check_cpu) {
@@ -275,6 +277,14 @@ cpu_callback(struct notifier_block *nfb,
 			check_cpu = any_online_cpu(temp_cpu_online_map);
 		}
 		break;
+
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+		if (!per_cpu(watchdog_task, hotcpu))
+			break;
+		/* Unbind so it can run.  Fall thru. */
+		kthread_bind(per_cpu(watchdog_task, hotcpu),
+			     any_online_cpu(cpu_online_map));
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		p = per_cpu(watchdog_task, hotcpu);
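
For readers wondering why the "break" in check_hung_uninterruptible_tasks()
had to become a "goto": in kernels of this era, do_each_thread() {
} while_each_thread() expands to a for loop wrapped around a do-while, so a
plain break only leaves the inner loop. The userspace sketch below is a toy
model of that shape, not kernel code. The macros do_each_item() and
while_each_item(), the constants NGROUPS and NTHREADS, and both scan
functions are hypothetical names made up for illustration.

```c
#include <assert.h>

#define NGROUPS  4	/* hypothetical: number of thread groups */
#define NTHREADS 3	/* hypothetical: threads per group */

/*
 * Toy stand-ins with the same shape as the kernel macros:
 * an outer for loop whose body is an inner do-while loop.
 */
#define do_each_item(g, t) \
	for ((g) = 0, (t) = 0; (g) < NGROUPS; (g)++, (t) = 0) do
#define while_each_item(g, t) \
	while (++(t) < NTHREADS)

/* Buggy variant: break leaves only the inner do-while. */
int scan_with_break(int max_count)
{
	int g, t, visited = 0;

	assert(max_count > 0);
	do_each_item(g, t) {
		if (!--max_count)
			break;	/* outer for loop keeps going */
		visited++;
	} while_each_item(g, t);

	return visited;
}

/* Fixed variant: goto leaves both loops at once. */
int scan_with_goto(int max_count)
{
	int g, t, visited = 0;

	assert(max_count > 0);
	do_each_item(g, t) {
		if (!--max_count)
			goto out;
		visited++;
	} while_each_item(g, t);
out:
	return visited;
}
```

With a budget of 2, scan_with_break(2) still visits 10 of the 12 items:
the counter goes negative after the first break and never hits zero again,
so the limit is effectively ignored. scan_with_goto(2) stops after 1 item,
which is the behaviour the max_count check was meant to have.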