From: Rusty Russell <rusty@rustcorp.com.au>
To: Ingo Molnar <mingo@elte.hu>
Cc: linux-kernel@vger.kernel.org, torvalds@transmeta.com
Subject: Re: [patch] "fully HT-aware scheduler" support, 2.5.31-BK-curr
Date: Wed, 28 Aug 2002 16:32:42 +1000
Message-ID: <20020828163242.2c84747f.rusty@rustcorp.com.au>
In-Reply-To: <Pine.LNX.4.44.0208270226190.12947-100000@localhost.localdomain>

On Tue, 27 Aug 2002 03:44:23 +0200 (CEST)
Ingo Molnar <mingo@elte.hu> wrote:

> The following properties have to be provided by a scheduler that wants to
> be 'fully HT-aware':

Thanks Mingo, I can take this off my TODO list (your implementation even looks
the same 8).

>  - HT-aware affinity.
> 
>    Tasks should attempt to 'stick' to physical CPUs, not logical CPUs.

Linus disagreed with this before when I discussed it with him, and with the
current (stupid, non-portable, broken) set_affinity syscall he's right.

You don't know if someone said "schedule me on cpu 0" because they really
want to be scheduled on CPU 0, or because they really *don't* want to be
scheduled on CPU 1 (where something else is running).  You can't just assume
they are equivalent if they are the same physical CPU.

My modified set_affinity syscall (which takes an "include/exclude" flag)
allows the arch to make this decision (eventually) since you know what the
user wants (it also means that you know what to do if they give you a
short bitmap, or a new cpu comes online/goes offline).
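
To make the include/exclude semantics concrete, here is a rough userspace
sketch of how the mask might be normalised.  It mirrors the idea in the patch
but is not the kernel code: SIM_NR_CPUS, sim_normalise_mask() and
sim_cpu_allowed() are made-up names for illustration, and the mask is passed
as plain bytes instead of going through copy_from_user().

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define SIM_NR_CPUS	128	/* hypothetical stand-in for NR_CPUS */
#define SIM_LONG_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define SIM_LONGS	((SIM_NR_CPUS + SIM_LONG_BITS - 1) / SIM_LONG_BITS)
#define SIM_MASK_BYTES	(SIM_LONGS * sizeof(unsigned long))

enum { SIM_AFFINITY_INCLUDE, SIM_AFFINITY_EXCLUDE };

/*
 * Normalise a caller-supplied mask of len bytes into out[].
 * Returns 0 on success or a negative errno value.
 */
static int sim_normalise_mask(int mode, const unsigned char *user, size_t len,
			      unsigned long out[SIM_LONGS])
{
	size_t i;
	size_t n = len < SIM_MASK_BYTES ? len : SIM_MASK_BYTES;

	/* A short mask is zero-extended: unmentioned CPUs are "not named". */
	memset(out, 0, SIM_MASK_BYTES);
	memcpy(out, user, n);

	/* A longer mask is fine as long as none of the extra bits are set. */
	for (i = SIM_MASK_BYTES; i < len; i++)
		if (user[i] != 0)
			return -ENOENT;

	switch (mode) {
	case SIM_AFFINITY_EXCLUDE:
		/* "All but these": invert, so unnamed CPUs become allowed. */
		for (i = 0; i < SIM_LONGS; i++)
			out[i] = ~out[i];
		break;
	case SIM_AFFINITY_INCLUDE:
		break;
	default:
		return -EINVAL;
	}
	return 0;
}

/* Test whether a given CPU number is set in a normalised mask. */
static int sim_cpu_allowed(const unsigned long mask[SIM_LONGS],
			   unsigned int cpu)
{
	return (int)((mask[cpu / SIM_LONG_BITS] >> (cpu % SIM_LONG_BITS)) & 1UL);
}
```

With an exclude mask naming only CPU 1, every other bit ends up set, including
bits for CPUs that are not online yet.  An include mask naming only CPU 1
leaves everything else clear, so a CPU that comes online later stays excluded.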

(Requires previous patches, so doesn't apply as is, but you get the idea).

Cheers,
Rusty.
-- 
   there are those who do and those who hang on and you don't see too
   many doers quoting their contemporaries.  -- Larry McVoy

Name: Modified set_affinity/get_affinity syscalls
Author: Rusty Russell
Status: Experimental
Depends: Hotcpu/cpumask.patch.gz

D: This allows userspace to have cpu affinity control without needing
D: to know the size of kernel datastructures and allows them to
D: control what happens when new CPUs are brought online.  It also
D: means that in the future we can sanely interpret affinity in
D: HyperThreading.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .29333-linux-2.5.31/include/linux/affinity.h .29333-linux-2.5.31.updated/include/linux/affinity.h
--- .29333-linux-2.5.31/include/linux/affinity.h	1970-01-01 10:00:00.000000000 +1000
+++ .29333-linux-2.5.31.updated/include/linux/affinity.h	2002-08-13 16:04:44.000000000 +1000
@@ -0,0 +1,9 @@
+#ifndef _LINUX_AFFINITY_H
+#define _LINUX_AFFINITY_H
+enum {
+	/* Set affinity to these processors */
+	LINUX_AFFINITY_INCLUDE,
+	/* Set affinity to all *but* these processors */
+	LINUX_AFFINITY_EXCLUDE,
+};
+#endif
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .29333-linux-2.5.31/kernel/sched.c .29333-linux-2.5.31.updated/kernel/sched.c
--- .29333-linux-2.5.31/kernel/sched.c	2002-08-13 16:04:25.000000000 +1000
+++ .29333-linux-2.5.31.updated/kernel/sched.c	2002-08-13 16:18:50.000000000 +1000
@@ -25,6 +25,7 @@
 #include <asm/mmu_context.h>
 #include <linux/interrupt.h>
 #include <linux/completion.h>
+#include <linux/affinity.h>
 #include <linux/kernel_stat.h>
 #include <linux/security.h>
 #include <linux/notifier.h>
@@ -1540,21 +1541,45 @@ out_unlock:
  * @len: length in bytes of the bitmask pointed to by user_mask_ptr
  * @user_mask_ptr: user-space pointer to the new cpu mask
  */
-asmlinkage int sys_sched_setaffinity(pid_t pid, unsigned int len,
-				      unsigned long *user_mask_ptr)
+asmlinkage int sys_sched_setaffinity(pid_t pid,
+				     int include,
+				     unsigned int len,
+				     unsigned long *user_mask_ptr)
 {
 	DECLARE_BITMAP(new_mask, NR_CPUS);
+	unsigned char c;
+	unsigned int i;
 	int retval;
 	task_t *p;
 
-	if (len < sizeof(new_mask))
-		return -EINVAL;
-
-	if (copy_from_user(&new_mask, user_mask_ptr, sizeof(new_mask)))
+	memset(&new_mask, 0, sizeof(new_mask));
+	if (copy_from_user(&new_mask, user_mask_ptr,
+			   min((size_t)len, sizeof(new_mask))))
 		return -EFAULT;
 
-	if (any_online_cpu(new_mask) == NR_CPUS)
+	/* longer is OK, as long as they don't actually set any of the bits. */
+	for (i = sizeof(new_mask); i < len; i++) {
+		if (get_user(c, (unsigned char *)user_mask_ptr + i))
+			return -EFAULT;
+		if (c != 0)
+			return -ENOENT;
+	}
+
+	/* Invert the mask in the exclude case. */
+	switch (include) {
+	case LINUX_AFFINITY_EXCLUDE:
+		for (i = 0; i < BITS_TO_LONG(NR_CPUS); i++)
+			new_mask[i] ^= ~0UL;
+		break;
+	case LINUX_AFFINITY_INCLUDE:
+		break;
+	default:
 		return -EINVAL;
+	}
+
+	/* Must mention at least one online CPU */
+	if (any_online_cpu(new_mask) == NR_CPUS)
+		return -EWOULDBLOCK; /* This is kinda true */
 
 	read_lock(&tasklist_lock);
 
@@ -1590,37 +1614,26 @@ out_unlock:
  * @pid: pid of the process
  * @len: length in bytes of the bitmask pointed to by user_mask_ptr
  * @user_mask_ptr: user-space pointer to hold the current cpu mask
+ * Returns the size required to hold the complete cpu mask.
  */
 asmlinkage int sys_sched_getaffinity(pid_t pid, unsigned int len,
-				      unsigned long *user_mask_ptr)
+				     void *user_mask_ptr)
 {
-	unsigned int real_len, i;
 	DECLARE_BITMAP(mask, NR_CPUS);
-	int retval;
 	task_t *p;
 
-	real_len = sizeof(mask);
-	if (len < real_len)
-		return -EINVAL;
-
 	read_lock(&tasklist_lock);
-
-	retval = -ESRCH;
 	p = find_process_by_pid(pid);
-	if (!p)
-		goto out_unlock;
-
-	retval = 0;
-	for (i = 0; i < ARRAY_SIZE(mask); i++)
-		mask[i] = (p->cpus_allowed[i] & cpu_online_map[i]);
-
-out_unlock:
+	if (!p) {
+		read_unlock(&tasklist_lock);
+		return -ESRCH;
+	}
+	memcpy(mask, p->cpus_allowed, sizeof(mask));
 	read_unlock(&tasklist_lock);
-	if (retval)
-		return retval;
-	if (copy_to_user(user_mask_ptr, &mask, real_len))
+
+	if (copy_to_user(user_mask_ptr, &mask, min((size_t)len, sizeof(mask))))
 		return -EFAULT;
-	return real_len;
+	return sizeof(mask);
 }
 
 /**
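
Since the modified getaffinity returns the size needed for the complete mask,
a caller could ask for the size first and then fetch the whole thing.  The
following is only a userspace simulation of that calling pattern, assuming the
call copies at most the full mask: demo_getaffinity(), demo_read_full_mask()
and DEMO_MASK_BYTES are hypothetical names, not real syscall wrappers.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MASK_BYTES 16	/* hypothetical kernel-side mask size */

/* Pretend kernel state: CPUs 0-3 allowed. */
static const unsigned char demo_kernel_mask[DEMO_MASK_BYTES] = { 0x0f };

/*
 * Stand-in for the modified getaffinity call: copies at most len bytes
 * and always returns the size of the complete mask.
 */
static int demo_getaffinity(size_t len, void *buf)
{
	size_t n = len < DEMO_MASK_BYTES ? len : DEMO_MASK_BYTES;

	memcpy(buf, demo_kernel_mask, n);
	return DEMO_MASK_BYTES;
}

/*
 * Ask for the required size with a zero-length request, then allocate
 * and fetch the full mask.  The caller frees the returned buffer.
 */
static unsigned char *demo_read_full_mask(size_t *size_out)
{
	unsigned char dummy;
	unsigned char *buf;
	int need = demo_getaffinity(0, &dummy);

	if (need <= 0)
		return NULL;
	buf = malloc((size_t)need);
	if (!buf)
		return NULL;
	if (demo_getaffinity((size_t)need, buf) != need) {
		free(buf);
		return NULL;
	}
	*size_out = (size_t)need;
	return buf;
}
```

The point of the pattern is that the caller never hardcodes the mask size; a
too-small buffer just receives a truncated copy plus the size to retry with.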
