From: Gautham R Shenoy <ego@in.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Slaby <jirislaby@gmail.com>,
linux-kernel@vger.kernel.org,
Linux-pm mailing list <linux-pm@lists.linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Arjan van de Ven <arjan@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: broken suspend (sched related) [Was: 2.6.24-rc4-mm1]
Date: Mon, 10 Dec 2007 16:38:18 +0530
Message-ID: <20071210110818.GC12880@in.ibm.com>
In-Reply-To: <20071210102157.GB31103@elte.hu>
On Mon, Dec 10, 2007 at 11:21:57AM +0100, Ingo Molnar wrote:
>
> * Gautham R Shenoy <ego@in.ibm.com> wrote:
>
> > > i'm wondering, what's the proper CPU-hotplug safe sequence here
> > > then? I'm picking a CPU number from cpu_online_map, and that CPU
> > > could go away while i'm still using it, right? What's saving us
> > > here?
> >
> > In this particular case, we are trying to see if any task on a
> > particular cpu has not been scheduled for a really long time. If we do
> > this check on a cpu which has gone offline, then a) If the tasks have
> > not been migrated on to another cpu yet, we will still perform that
> > check and yell if something has been holding any task for a
> > sufficiently long time. b) If the tasks have been migrated off, then
> > we have nothing to check.
>
> say we've got 100 CPUs, so we've got 100 watchdog tasks running - one
> for each CPU. Checking for hung tasks is a global operation not a
> per-CPU operation (we iterate over the global tasklist), hence only one
> CPU should really be calling this function. That online-cpus logic
> achieves this by picking a single CPU. Perhaps it would be better to
> keep a hung_task_checker_cpu variable that is driven from a
> CPU-hotplug-down notifier? That way if a CPU is brought down we can
> update hung_task_checker_cpu to another, still-online CPU. (this would
> also be faster, because event-driven)

Do you mean something like this?

From: Gautham R Shenoy <ego@in.ibm.com>

softlockup: update check_cpu during cpu-hotplug

Update the check_cpu value during a cpu-hotplug operation
so that we don't check for hung tasks on a cpu which is about
to go offline.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Slaby <jirislaby@gmail.com>
---
kernel/softlockup.c | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
index 576eb9c..b1a8c7c 100644
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -194,6 +194,9 @@ static void check_hung_uninterruptible_tasks(int this_cpu)
 	read_unlock(&tasklist_lock);
 }
+
+static int check_cpu = -1;
+
 /*
  * The watchdog thread - runs every second and touches the timestamp.
  */
@@ -219,8 +222,6 @@ static int watchdog(void *__bind_cpu)
 		/*
 		 * Only do the hung-tasks check on one CPU:
 		 */
-		check_cpu = any_online_cpu(cpu_online_map);
-
 		if (this_cpu != check_cpu)
 			continue;
@@ -255,6 +256,7 @@ cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		break;
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
+		check_cpu = any_online_cpu(cpu_online_map);
 		wake_up_process(per_cpu(watchdog_task, hotcpu));
 		break;
 #ifdef CONFIG_HOTPLUG_CPU
@@ -265,6 +267,14 @@ cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		/* Unbind so it can run. Fall thru. */
 		kthread_bind(per_cpu(watchdog_task, hotcpu),
 			     any_online_cpu(cpu_online_map));
+	case CPU_DOWN_PREPARE:
+	case CPU_DOWN_PREPARE_FROZEN:
+		if (hotcpu == check_cpu) {
+			cpumask_t temp_cpu_online_map = cpu_online_map;
+			cpu_clear(hotcpu, temp_cpu_online_map);
+			check_cpu = any_online_cpu(temp_cpu_online_map);
+		}
+		break;
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		p = per_cpu(watchdog_task, hotcpu);
>
> Ingo
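
To make the re-selection in the CPU_DOWN_PREPARE case easier to follow outside the kernel tree, here is a minimal userspace sketch of the same idea. It is an illustration only, not kernel code: the plain unsigned int bitmask stands in for cpumask_t, and NR_CPUS = 8, first_cpu_in() and pick_check_cpu() are hypothetical names made up for this sketch. It assumes, as any_online_cpu() is understood to do, that an empty mask yields NR_CPUS.

```c
#include <stdio.h>

#define NR_CPUS 8	/* hypothetical CPU count, for this sketch only */

/*
 * Hypothetical stand-in for any_online_cpu(): return the lowest CPU
 * number set in the mask, or NR_CPUS if the mask is empty.
 */
static int first_cpu_in(unsigned int mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/*
 * Mirrors the CPU_DOWN_PREPARE logic: only when the CPU going down is
 * the current checker do we pick a new one, from a copy of the online
 * mask with the departing CPU cleared.
 */
static int pick_check_cpu(int check_cpu, int hotcpu, unsigned int online_mask)
{
	if (hotcpu == check_cpu) {
		unsigned int tmp_mask = online_mask & ~(1u << hotcpu);

		check_cpu = first_cpu_in(tmp_mask);
	}
	return check_cpu;
}
```

With CPUs 0-3 online (mask 0x0f) and CPU 0 as the checker, taking CPU 0 down moves the check to CPU 1, while taking CPU 2 down leaves the checker unchanged.

One thing worth double-checking in the hunk as posted: the new cases sit between the kthread_bind() marked "Fall thru" and case CPU_DEAD, and they end in a break, so the preceding case would appear to stop there instead of reaching CPU_DEAD. Moving the CPU_DOWN_PREPARE cases ahead of that block would likely keep the original fall-through intact.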
Thanks and Regards
gautham.
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"