public inbox for linux-kernel@vger.kernel.org
* [PATCH] stop on cpu lost
@ 2006-06-20  3:51 KAMEZAWA Hiroyuki
From: KAMEZAWA Hiroyuki @ 2006-06-20  3:51 UTC
  To: LKML; +Cc: ashok.raj, pavel, clameter, ak, nickpiggin, mingo, Andrew Morton

When an application is misconfigured at cpu hot removal, a task's
cpus_allowed can become empty. This patch adds a sysctl to stop tasks whose
cpus_allowed is empty.

I don't think there is one good answer to this problem; it depends on
system management policy. On one system, forced migration is better than
stopping the task. On another, stopping (and then killing) tasks will meet
the requirement.

How about this?

-Kame

Currently, when a task loses all of its allowed cpus because of cpu hot
removal, it is forced to migrate to a cpu that is not in its cpus_allowed.
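
The fallback described above can be sketched in user space as follows. This
is only a toy model, not the kernel code: cpus are bits in a plain unsigned
long, the names pick_dest_cpu(), first_cpu_in() and NO_CPU are made up for
illustration, and the same-node preference in move_task_off_dead_cpu() is
left out.

```c
#include <assert.h>
#include <limits.h>

#define NO_CPU (-1)	/* hypothetical sentinel, stands in for NR_CPUS */

/* Return the lowest cpu number set in mask, or NO_CPU if mask is empty. */
int first_cpu_in(unsigned long mask)
{
	int cpu;

	for (cpu = 0; cpu < (int)(sizeof(mask) * CHAR_BIT); cpu++)
		if (mask & (1UL << cpu))
			return cpu;
	return NO_CPU;
}

/*
 * Pick a destination for a task leaving a dead cpu (toy model).
 * online:  cpus still online, with the dead cpu already cleared
 * allowed: the task's allowed mask; widened here if nothing is left
 * lost:    set to 1 if the task had no allowed cpu online, else 0
 */
int pick_dest_cpu(unsigned long online, unsigned long *allowed, int *lost)
{
	int dest;

	assert(allowed && lost);

	dest = first_cpu_in(online & *allowed);
	*lost = 0;
	if (dest == NO_CPU) {
		/* No allowed cpu is online: widen the mask and force it. */
		*allowed = ~0UL;
		dest = first_cpu_in(online);
		*lost = 1;
	}
	return dest;
}
```

In this model, the "lost" case is the one where the task ends up on a cpu
its original mask did not permit.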

In this case, the task was not properly reconfigured by the user before
cpu hot removal, so the task (and the system) is in an unexpected, wrong
state. Forced migration is one realistic workaround, but sometimes it is
harmful (stealing other cpus' time, triggering bugs in thread controllers,
causing unexpected execution...).

This patch adds the sysctl "sigstop_on_cpu_lost". When sigstop_on_cpu_lost==1,
a task which loses all of its cpus is stopped by SIGSTOP (only tasks with an
mm, i.e. not kernel threads). Depending on system management policy,
misconfigured applications can be stopped this way.
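
Since the entry is added to kern_table, the switch should show up as
/proc/sys/kernel/sigstop_on_cpu_lost on a kernel with this patch applied (the
file does not exist otherwise). A minimal sketch of setting it from a
management tool, where write_int_setting() is a hypothetical helper name and
the path is supplied by the caller:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write an integer followed by a newline to a sysctl-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_int_setting(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	/* fclose() flushes, so a rejected write shows up here at the latest. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A tool would call write_int_setting("/proc/sys/kernel/sigstop_on_cpu_lost", 1)
before offlining a cpu; writing 1 to that file from a shell should work
equally well.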

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


 include/linux/sysctl.h |    1 +
 kernel/sched.c         |   14 ++++++++++++++
 kernel/sysctl.c        |   14 ++++++++++++++
 3 files changed, 29 insertions(+)

Index: linux-2.6.17/kernel/sched.c
===================================================================
--- linux-2.6.17.orig/kernel/sched.c
+++ linux-2.6.17/kernel/sched.c
@@ -4562,11 +4562,13 @@ wait_to_die:
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
+int sigstop_on_cpu_lost;
 /* Figure out where task on dead CPU should go, use force if neccessary. */
 static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
 {
 	int dest_cpu;
 	cpumask_t mask;
+	int force = 0;
 
 	/* On same node? */
 	mask = node_to_cpumask(cpu_to_node(dead_cpu));
@@ -4591,8 +4593,20 @@ static void move_task_off_dead_cpu(int d
 			printk(KERN_INFO "process %d (%s) no "
 			       "longer affine to cpu%d\n",
 			       tsk->pid, tsk->comm, dead_cpu);
+		/*
+		 * This thread was not properly reconfigured before cpu hot
+		 * remove. This means this process is in the wrong state now.
+		 * If system management policy doesn't allow misconfigured
+		 * applications, this process should be stopped.
+		 */
+		if (tsk->mm && sigstop_on_cpu_lost)
+			force = 1;
 	}
 	__migrate_task(tsk, dead_cpu, dest_cpu);
+
+	if (force) {
+		force_sig_specific(SIGSTOP, tsk);
+	}
 }
 
 /*
Index: linux-2.6.17/kernel/sysctl.c
===================================================================
--- linux-2.6.17.orig/kernel/sysctl.c
+++ linux-2.6.17/kernel/sysctl.c
@@ -127,6 +127,10 @@ extern int sysctl_hz_timer;
 extern int acct_parm[];
 #endif
 
+#ifdef CONFIG_HOTPLUG_CPU
+extern int sigstop_on_cpu_lost;
+#endif
+
 #ifdef CONFIG_IA64
 extern int no_unaligned_warning;
 #endif
@@ -683,6 +687,16 @@ static ctl_table kern_table[] = {
 		.proc_handler	= &proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_HOTPLUG_CPU
+	{
+		.ctl_name	= KERN_STOP_ON_CPU_LOST,
+		.procname	= "sigstop_on_cpu_lost",
+		.data		= &sigstop_on_cpu_lost,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+#endif
 	{ .ctl_name = 0 }
 };
 
Index: linux-2.6.17/include/linux/sysctl.h
===================================================================
--- linux-2.6.17.orig/include/linux/sysctl.h
+++ linux-2.6.17/include/linux/sysctl.h
@@ -148,6 +148,7 @@ enum
 	KERN_SPIN_RETRY=70,	/* int: number of spinlock retries */
 	KERN_ACPI_VIDEO_FLAGS=71, /* int: flags for setting up video after ACPI sleep */
 	KERN_IA64_UNALIGNED=72, /* int: ia64 unaligned userland trap enable */
+	KERN_STOP_ON_CPU_LOST=73, /* int: SIGSTOP when a task loses its cpus */
 };
 
 


Thread overview: 22+ messages
2006-06-20  3:51 [PATCH] stop on cpu lost KAMEZAWA Hiroyuki
2006-06-22  5:56 ` Andrew Morton
2006-06-22  6:14   ` Christoph Lameter
2006-06-22 15:08   ` Nathan Lynch
2006-06-22 15:45     ` Randy.Dunlap
2006-06-22 15:45       ` Christoph Lameter
2006-06-22 16:05         ` KAMEZAWA Hiroyuki
2006-06-22 16:14           ` Christoph Lameter
2006-06-22 16:24           ` Randy.Dunlap
2006-06-22 17:04             ` Nathan Lynch
2006-06-22 17:20               ` KAMEZAWA Hiroyuki
2006-06-22 18:22             ` Pavel Machek
2006-06-22 18:35               ` Christoph Lameter
2006-06-22 18:37                 ` Pavel Machek
2006-06-22 18:54               ` Hugh Dickins
2006-06-22 19:27                 ` Nick Piggin
2006-06-22 19:46                   ` Hugh Dickins
2006-06-22 19:57                     ` Nick Piggin
2006-06-22 20:25                       ` Hugh Dickins
2006-06-22 21:44                   ` Pavel Machek
2006-06-22 19:52               ` Jeremy Fitzhardinge
2006-06-22 21:46                 ` Pavel Machek
