* [PATCH] syscall interface for cpu affinity
@ 2002-03-10 18:15 Robert Love
2002-03-10 20:29 ` Andreas Jaeger
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Robert Love @ 2002-03-10 18:15 UTC (permalink / raw)
To: torvalds; +Cc: linux-kernel
Linus,
I have updated the patch a bit and resynced to 2.5.6. Are you
interested? I believe a user interface for setting task CPU affinity is
useful and completes the rest of our sched_* syscalls. A syscall
implementation seems to be what everyone wants (I have a proc-interface,
too...)
This patch implements
int sched_set_affinity(pid_t pid, unsigned int len,
unsigned long *new_mask_ptr);
int sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
unsigned long *user_mask_ptr);
which set and get the cpu affinity (task->cpus_allowed) for a task,
using the set_cpus_allowed function in Ingo's scheduler. The functions
properly support changes to cpus_allowed, implement security, and are
well-tested.
They are based on Ingo's older affinity syscall patch and my older
affinity proc patch.
Comments?
Robert Love
diff -urN linux-2.5.6/arch/i386/kernel/entry.S linux/arch/i386/kernel/entry.S
--- linux-2.5.6/arch/i386/kernel/entry.S Thu Mar 7 21:18:19 2002
+++ linux/arch/i386/kernel/entry.S Sun Mar 10 13:01:03 2002
@@ -717,6 +717,8 @@
.long SYMBOL_NAME(sys_fremovexattr)
.long SYMBOL_NAME(sys_tkill)
.long SYMBOL_NAME(sys_sendfile64)
+ .long SYMBOL_NAME(sys_sched_set_affinity) /* 240 */
+ .long SYMBOL_NAME(sys_sched_get_affinity)
.rept NR_syscalls-(.-sys_call_table)/4
.long SYMBOL_NAME(sys_ni_syscall)
diff -urN linux-2.5.6/include/asm-i386/unistd.h linux/include/asm-i386/unistd.h
--- linux-2.5.6/include/asm-i386/unistd.h Thu Mar 7 21:18:55 2002
+++ linux/include/asm-i386/unistd.h Sun Mar 10 13:03:41 2002
@@ -244,6 +244,8 @@
#define __NR_fremovexattr 237
#define __NR_tkill 238
#define __NR_sendfile64 239
+#define __NR_sched_set_affinity 240
+#define __NR_sched_get_affinity 241
/* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
diff -urN linux-2.5.6/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.6/kernel/sched.c Thu Mar 7 21:18:19 2002
+++ linux/kernel/sched.c Sun Mar 10 12:59:26 2002
@@ -1215,6 +1215,95 @@
return retval;
}
+/**
+ * sys_sched_set_affinity - set the cpu affinity of a process
+ * @pid: pid of the process
+ * @len: length in bytes of the mask pointed to by new_mask_ptr
+ * @new_mask_ptr: user-space pointer to the new cpu mask
+ */
+asmlinkage int sys_sched_set_affinity(pid_t pid, unsigned int len,
+ unsigned long *new_mask_ptr)
+{
+ unsigned long new_mask;
+ task_t *p;
+ int retval;
+
+ if (len < sizeof(new_mask))
+ return -EINVAL;
+
+ if (copy_from_user(&new_mask, new_mask_ptr, sizeof(new_mask)))
+ return -EFAULT;
+
+ new_mask &= cpu_online_map;
+ if (!new_mask)
+ return -EINVAL;
+
+ read_lock(&tasklist_lock);
+
+ retval = -ESRCH;
+ p = find_process_by_pid(pid);
+ if (!p)
+ goto out_unlock;
+
+ retval = -EPERM;
+ if ((current->euid != p->euid) && (current->euid != p->uid) &&
+ !capable(CAP_SYS_NICE))
+ goto out_unlock;
+
+ retval = 0;
+#ifdef CONFIG_SMP
+ set_cpus_allowed(p, new_mask);
+#endif
+
+out_unlock:
+ read_unlock(&tasklist_lock);
+ return retval;
+}
+
+/**
+ * sys_sched_get_affinity - get the cpu affinity of a process
+ * @pid: pid of the process
+ * @user_len_ptr: userspace pointer to the length of the mask
+ * @user_mask_ptr: userspace pointer to the mask
+ */
+asmlinkage int sys_sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
+ unsigned long *user_mask_ptr)
+{
+ unsigned long mask;
+ unsigned int len, user_len;
+ task_t *p;
+ int retval;
+
+ len = sizeof(mask);
+
+ if (copy_from_user(&user_len, user_len_ptr, sizeof(user_len)))
+ return -EFAULT;
+
+ if (copy_to_user(user_len_ptr, &len, sizeof(len)))
+ return -EFAULT;
+
+ if (user_len < len)
+ return -EINVAL;
+
+ read_lock(&tasklist_lock);
+
+ retval = -ESRCH;
+ p = find_process_by_pid(pid);
+ if (!p)
+ goto out_unlock;
+
+ retval = 0;
+ mask = p->cpus_allowed & cpu_online_map;
+
+out_unlock:
+ read_unlock(&tasklist_lock);
+ if (retval)
+ return retval;
+ if (copy_to_user(user_mask_ptr, &mask, sizeof(mask)))
+ return -EFAULT;
+ return 0;
+}
+
asmlinkage long sys_sched_yield(void)
{
runqueue_t *rq;
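The argument checks at the top of sys_sched_set_affinity() can be modelled in user space. The following is a minimal sketch, not kernel code: model_online_map and model_check_mask are hypothetical names, the online map is hard-coded to four CPUs, and a NULL pointer stands in for a copy_from_user() fault.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's cpu_online_map: CPUs 0-3 online. */
static const unsigned long model_online_map = 0xfUL;

/*
 * User-space model of the argument checks in the patch's
 * sys_sched_set_affinity(). Returns 0 and stores the effective mask on
 * success, or a negative errno value on failure.
 */
static int model_check_mask(size_t len, const unsigned long *new_mask_ptr,
                            unsigned long *effective)
{
    unsigned long new_mask;

    if (len < sizeof(new_mask))     /* caller's buffer is too short */
        return -EINVAL;
    if (new_mask_ptr == NULL)       /* models a copy_from_user() fault */
        return -EFAULT;

    new_mask = *new_mask_ptr & model_online_map;   /* drop offline CPUs */
    if (!new_mask)                  /* no online CPU left to run on */
        return -EINVAL;

    *effective = new_mask;
    return 0;
}
```

With this model, a mask of 0x13 is silently narrowed to 0x3, while a mask naming only offline CPUs (for example 0x30) is rejected with -EINVAL.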
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 18:15 [PATCH] syscall interface for cpu affinity Robert Love
@ 2002-03-10 20:29 ` Andreas Jaeger
2002-03-10 20:53 ` Robert Love
2002-03-10 22:05 ` Chris Wedgwood
2002-03-11 0:38 ` Andreas Ferber
2 siblings, 1 reply; 16+ messages in thread
From: Andreas Jaeger @ 2002-03-10 20:29 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
Robert Love <rml@tech9.net> writes:
> Linus,
>
> I have updated the patch a bit and resynced to 2.5.6. Are you
> interested? I believe a user interface for setting task CPU affinity is
> useful and completes the rest of our sched_* syscalls. A syscall
> implementation seems to be what everyone wants (I have a proc-interface,
> too...)
Please add the procinterface also! I've found it today (for 2.4.18)
and it's much easier to use with existing programs.
Andreas
> This patch implements
>
> int sched_set_affinity(pid_t pid, unsigned int len,
> unsigned long *new_mask_ptr);
>
> int sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
> unsigned long *user_mask_ptr)
>
> which set and get the cpu affinity (task->cpus_allowed) for a task,
> using the set_cpus_allowed function in Ingo's scheduler. The functions
> properly support changes to cpus_allowed, implement security, and are
> well-tested.
>
> They are based on Ingo's older affinity syscall patch and my older
> affinity proc patch.
>
> Comments?
Please add it for all archs - this is not only interesting for x86,
Andreas
[...]
--
Andreas Jaeger
SuSE Labs aj@suse.de
private aj@arthur.inka.de
http://www.suse.de/~aj
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 20:29 ` Andreas Jaeger
@ 2002-03-10 20:53 ` Robert Love
2002-03-10 21:03 ` Andreas Jaeger
2002-03-10 23:45 ` Jeff Garzik
0 siblings, 2 replies; 16+ messages in thread
From: Robert Love @ 2002-03-10 20:53 UTC (permalink / raw)
To: Andreas Jaeger; +Cc: torvalds, linux-kernel
On Sun, 2002-03-10 at 15:29, Andreas Jaeger wrote:
> Please add the procinterface also! I've found it today (for 2.4.18)
> and it's much easier to use with existing programs.
I agree and I really like the proc-interface. There is something uber
cool about:
cat 1 > /proc/pid/affinity
I have a patch for 2.5.6 for proc-based affinity interface here:
http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/v2.5/cpu-affinity-proc-rml-2.5.6-1.patch
I suspect, however, that despite both patches being small we really only
want to pick and standardize on one. The syscall interface has two main
things going for it against a proc-based implementation: it is faster
and /proc may not be mounted. The masses have spoken on this issue.
Note you can use the syscall interface with existing programs, too.
Just write a program to take in a pid and mask and call
sched_set_affinity.
> Please add it for all archs - this is not only interesting for x86,
I'll send Linus the patch for other arches if/when he accepts this patch
- I have no problem with that.
Robert Love
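The "program to take in a pid and mask" mentioned above could look roughly like the sketch below. It is only an illustration: affinity_tool, parse_ulong and stub_set_affinity are hypothetical names, and because no libc exports the patch's sched_set_affinity(), the call is passed in as a function pointer with the same shape as the proposed prototype. The stub merely records its arguments so the argument handling can be exercised anywhere.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Same shape as the sched_set_affinity() prototype in the patch. */
typedef int (*set_affinity_fn)(pid_t pid, unsigned int len,
                               unsigned long *new_mask_ptr);

/* Parse a whole string as an unsigned number; returns 0 on success. */
static int parse_ulong(const char *s, int base, unsigned long *out)
{
    char *end;

    if (s == NULL || *s == '\0')
        return -1;
    errno = 0;
    *out = strtoul(s, &end, base);
    return (errno != 0 || *end != '\0') ? -1 : 0;
}

/*
 * Hypothetical helper body: argv[1] is a decimal pid, argv[2] a hex mask.
 * Returns 2 on a usage error, 1 if the affinity call fails, 0 on success.
 */
static int affinity_tool(int argc, char **argv, set_affinity_fn set_affinity)
{
    unsigned long pid, mask;

    if (argc != 3 || parse_ulong(argv[1], 10, &pid) ||
        parse_ulong(argv[2], 16, &mask))
        return 2;

    return set_affinity((pid_t)pid, sizeof(mask), &mask) ? 1 : 0;
}

/* Illustration-only stand-in: records its arguments instead of trapping. */
static pid_t stub_pid;
static unsigned long stub_mask;

static int stub_set_affinity(pid_t pid, unsigned int len,
                             unsigned long *new_mask_ptr)
{
    if (len < sizeof(*new_mask_ptr))
        return -1;
    stub_pid = pid;
    stub_mask = *new_mask_ptr;
    return 0;
}
```

A real tool would pass a thin wrapper around the new syscall in place of stub_set_affinity; an invocation such as "tool 1234 3" would then restrict pid 1234 to CPUs 0 and 1.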
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 20:53 ` Robert Love
@ 2002-03-10 21:03 ` Andreas Jaeger
2002-03-10 22:23 ` Andreas Schwab
2002-03-10 23:56 ` Andreas Ferber
2002-03-10 23:45 ` Jeff Garzik
1 sibling, 2 replies; 16+ messages in thread
From: Andreas Jaeger @ 2002-03-10 21:03 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
Robert Love <rml@tech9.net> writes:
> On Sun, 2002-03-10 at 15:29, Andreas Jaeger wrote:
>
>> Please add the procinterface also! I've found it today (for 2.4.18)
>> and it's much easier to use with existing programs.
>
> I agree and I really like the proc-interface. There is something uber
> cool about:
>
> cat 1 > /proc/pid/affinity
I agree.
> I have a patch for 2.5.6 for proc-based affinity interface here:
>
> http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/v2.5/cpu-affinity-proc-rml-2.5.6-1.patch
>
> I suspect, however, that despite both patches being small we really only
> want to pick and standardize on one. The syscall interface has two main
> things going for it against a proc-based implementation: it is faster
> and /proc may not be mounted. The masses have spoken on this issue.
>
> Note you can use the syscall interface with existing programs, too.
> Just write a program to take in a pid and mask and call
> sched_set_affinity.
What I need at the moment is a wrapper - and you can do it two ways:
$ run_with_affinity 1 program arguments...
$ (cat 1 > /proc/self/affinity; program arguments...)
The second one is much easier coded ;-)
>> Please add it for all archs - this is not only interesting for x86,
>
> I'll send Linus the patch for other arches if/when he accepts this patch
> - I have no problem with that.
Thanks,
Andreas
--
Andreas Jaeger
SuSE Labs aj@suse.de
private aj@arthur.inka.de
http://www.suse.de/~aj
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 21:03 ` Andreas Jaeger
@ 2002-03-10 22:23 ` Andreas Schwab
2002-03-10 23:56 ` Andreas Ferber
1 sibling, 0 replies; 16+ messages in thread
From: Andreas Schwab @ 2002-03-10 22:23 UTC (permalink / raw)
To: Andreas Jaeger; +Cc: Robert Love, torvalds, linux-kernel
Andreas Jaeger <aj@suse.de> writes:
|> What I need at the moment is a wrapper - and you can do it two ways:
|>
|> $ run_with_affinity 1 program arguments...
|> $ (cat 1 > /proc/self/affinity; program arguments...)
|>
|> The second one is much easier coded ;-)
Apparently not, since that should be
$ (echo 1 > /proc/self/affinity; program arguments...)
:-)
Andreas.
--
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE GmbH, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 21:03 ` Andreas Jaeger
2002-03-10 22:23 ` Andreas Schwab
@ 2002-03-10 23:56 ` Andreas Ferber
1 sibling, 0 replies; 16+ messages in thread
From: Andreas Ferber @ 2002-03-10 23:56 UTC (permalink / raw)
To: Andreas Jaeger; +Cc: Robert Love, linux-kernel
On Sun, Mar 10, 2002 at 10:03:02PM +0100, Andreas Jaeger wrote:
> >
> > Note you can use the syscall interface with existing programs, too.
> > Just write a program to take in a pid and mask and call
> > sched_set_affinity.
> What I need at the moment is a wrapper - and you can do it two ways:
>
> $ run_with_affinity 1 program arguments...
> $ (cat 1 > /proc/self/affinity; program arguments...)
>
> The second one is much easier coded ;-)
$ (set_affinity 1; program arguments...)
set_affinity just calls sched_set_affinity(getppid()), and everything
is fine (and even shorter to type) :-)
Andreas
--
Andreas Ferber - dev/consulting GmbH - Bielefeld, FRG
---------------------------------------------------------
+49 521 1365800 - af@devcon.net - www.devcon.net
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 20:53 ` Robert Love
2002-03-10 21:03 ` Andreas Jaeger
@ 2002-03-10 23:45 ` Jeff Garzik
1976-03-03 15:58 ` Tim Hockin
1 sibling, 1 reply; 16+ messages in thread
From: Jeff Garzik @ 2002-03-10 23:45 UTC (permalink / raw)
To: Robert Love; +Cc: Andreas Jaeger, torvalds, linux-kernel
Robert Love wrote:
>
> On Sun, 2002-03-10 at 15:29, Andreas Jaeger wrote:
>
> > Please add the procinterface also! I've found it today (for 2.4.18)
> > and it's much easier to use with existing programs.
>
> I agree and I really like the proc-interface. There is something uber
> cool about:
>
> cat 1 > /proc/pid/affinity
>
> I have a patch for 2.5.6 for proc-based affinity interface here:
>
> http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/v2.5/cpu-affinity-proc-rml-2.5.6-1.patch
Anon! But there is something uber-ugly about constantly jamming more
and more stuff into procfs without thinking or planning long term... I
vote for the non-procfs approach :)
--
Jeff Garzik | Usenet Rule #2 (John Gilmore): "The Net interprets
Building 1024 | censorship as damage and routes around it."
MandrakeSoft |
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 23:45 ` Jeff Garzik
@ 1976-03-03 15:58 ` Tim Hockin
2002-03-11 0:08 ` Jeff Garzik
0 siblings, 1 reply; 16+ messages in thread
From: Tim Hockin @ 1976-03-03 15:58 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Robert Love, Andreas Jaeger, torvalds, linux-kernel
> Anon! But there is something uber-ugly about constantly jamming more
> and more stuff into procfs without thinking or planning long term... I
> vote for the non-procfs approach :)
At some point I had done a port of SGI's pset/sysmp interface to linux 2.2.
As far as I know, lots of people are still using it. I haven't ported it to
2.4 for various reasons, but I have to say - IT IS A MUCH BETTER INTERFACE
than all these ad-hoc cpus_allowed bits. If I thought that it had a chance
of inclusion, maybe I'd port it up, but last I heard none of the "core"
people wanted it.
If we are going to pick an affinity system, please, let's consider sysmp().
Tim
* Re: [PATCH] syscall interface for cpu affinity
1976-03-03 15:58 ` Tim Hockin
@ 2002-03-11 0:08 ` Jeff Garzik
2002-03-11 0:32 ` Tim Hockin
0 siblings, 1 reply; 16+ messages in thread
From: Jeff Garzik @ 2002-03-11 0:08 UTC (permalink / raw)
To: Tim Hockin; +Cc: Robert Love, Andreas Jaeger, torvalds, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 642 bytes --]
Tim Hockin wrote:
> If we are going to pick an affinity system, please, let's consider sysmp().
Not too bad. I picked a random sysmp(2) man page off the net (attached
for ease of other's reference).
It duplicates some stuff set elsewhere, and seems more than a bit like
ioctl(2) by another name, but doesn't seem too bad. Note we should be
careful not to overengineer the interface, either...
Just setting a bitmask does seem a bit limiting when thinking about the
future, agreed.
--
Jeff Garzik | Usenet Rule #2 (John Gilmore): "The Net interprets
Building 1024 | censorship as damage and routes around it."
MandrakeSoft |
[-- Attachment #2: sysmp.man.txt --]
[-- Type: text/plain, Size: 10690 bytes --]
sysmp - multiprocessing control

C SYNOPSIS
     #include <sys/types.h>
     #include <sys/sysmp.h>
     #include <sys/sysinfo.h> /* for SAGET and MINFO structures */
     int sysmp (int cmd, ...);
     ptrdiff_t sysmp (int cmd, ...);

DESCRIPTION
     sysmp provides control/information for miscellaneous system services.
     This system call is usually used by system programs and is not
     intended for general use. The arguments arg1, arg2, arg3, arg4 are
     provided for command-dependent use.
     As specified by cmd, the following commands are available:
     MP_CLEARCFSSTAT
     MP_CLEARNFSSTAT
     MP_NUMA_GETCPUNODEMAP
     MP_NUMA_GETDISTMATRIX
          These are all interfaces that are used to implement various
          system library functions. They are all subject to change and
          should not be called directly by applications.
     MP_PGSIZE
          The page size of the system is returned (see getpagesize(2)).
     MP_SCHED
          Interface for the schedctl(2) system call.
     MP_NPROCS
          Returns the number of processors physically configured.
     MP_NAPROCS
          Returns the number of processors that are available to schedule
          unrestricted processes.
     MP_STAT
          The processor ids and status flag bits of the physically
          configured processors are copied into an array of pda_stat
          structures to which arg1 points. The array must be large enough
          to hold as many pda_stat structures as the number of processors
          returned by the MP_NPROCS sysmp command. The pda_stat structure
          and the various status bits are defined in <sys/pda.h>.
     MP_EMPOWER
          The processor number given by arg1, interpreted as an 'int', is
          empowered to run any unrestricted processes. This is the default
          for all processors. This command requires superuser authority.
     MP_RESTRICT
          The processor number given by arg1, interpreted as an 'int', is
          restricted from running any processes except those assigned to
          it by a MP_MUSTRUN or MP_MUSTRUN_PID command, a runon(1) command
          or because of hardware necessity. Note that processor 0 cannot
          be restricted. This command requires superuser authority. On
          Challenge Series machines, all timers belonging to the processor
          are moved to the processor that owns the clock as reported by
          MP_CLOCK.
     MP_ISOLATE
          The processor number given by arg1, interpreted as an 'int', is
          isolated from running any processes except those assigned to it
          by a MP_MUSTRUN command, a runon(1) command or because of
          hardware necessity. Instruction cache and Translation Lookaside
          Buffer synchronization across processors in the system is
          minimized or delayed on an isolated processor until system
          services are requested. Note that processor 0 cannot be
          isolated. This command requires superuser authority. On
          Challenge Series machines, all timers belonging to the processor
          are moved to the processor that owns the clock as reported by
          MP_CLOCK.
     MP_UNISOLATE
          The processor number given by arg1, interpreted as an 'int', is
          unisolated and empowered to run any unrestricted processes. This
          is the default system configuration for all processors. This
          command requires superuser authority.
     MP_PREEMPTIVE
          The processor number given by arg1, interpreted as an 'int', has
          its clock scheduler enabled. This is the default for all
          processors. This command requires superuser authority.
     MP_NONPREEMPTIVE
          The processor number given by arg1, interpreted as an 'int', has
          its clock scheduler disabled. Normal process time slicing is no
          longer enforced on that processor. As a result of turning off
          the clock interrupt, the interrupt latency on this processor
          will be lower. This command requires superuser authority and is
          allowed only on an isolated processor. This command is not
          allowed on the clock processor (see MP_CLOCK).
     MP_CLOCK
          The processor number given by arg1, interpreted as an 'int', is
          given charge of the operating system software clock (see
          timers(5)). This command requires superuser authority.
     MP_FASTCLOCK
          The processor number given by arg1, interpreted as an 'int', is
          given charge of the operating system software fast clock (see
          timers(5)). This command requires superuser authority.
     MP_MISER_GETREQUEST
     MP_MISER_SENDREQUEST
     MP_MISER_RESPOND
     MP_MISER_GETRESOURCE
     MP_MISER_SETRESOURCE
     MP_MISER_CHECKACCESS
          These are all interfaces that are used to implement various
          miser(1) functions. These are all subject to change and should
          not be called directly by applications.
     MP_MUSTRUN
          Assigns the calling process to run only on the processor number
          by arg1, interpreted as an 'int', except as required for
          communications with hardware devices. A process that has
          allocated a CC sync register (see ccsync(7m)) is restricted to
          running on a particular cpu. Attempts to reassign such a process
          to another cpu will fail until the CC sync register has been
          relinquished.
     MP_MUSTRUN_PID
          Assigns the process specified by arg2 to run only on the
          processor number specified by arg1, both interpreted as 'int',
          except as required for communications with hardware devices. A
          process that has allocated a CC sync register (see ccsync(7m))
          is restricted to running on a particular cpu. Attempts to
          reassign such a process to another cpu will fail until the CC
          sync register has been relinquished.
     MP_GETMUSTRUN
          Returns the processor the current process has been set to run on
          using the MP_MUSTRUN command. If the current process has not
          been assigned to a specific processor, -1 is returned and errno
          is set to EINVAL.
     MP_GETMUSTRUN_PID
          Returns the processor that the process specified by arg1 has
          been set to run on using the MP_MUSTRUN or MP_MUSTRUN_PID
          command. If the process has not been assigned to a specific
          processor, -1 is returned and errno is set to EINVAL.
     MP_RUNANYWHERE
          Frees the calling process to run on whatever processor the
          system deems suitable.
     MP_RUNANYWHERE_PID
          Frees the process specified by arg1 to run on whatever processor
          the system deems suitable.
     MP_KERNADDR
          Returns the address of various kernel data structures. The
          structure returned is selected by arg1. The list of available
          structures is detailed in <sys/sysmp.h>. This option is used by
          many system programs to avoid having to look in /unix for the
          location of the data structures.
     MP_SASZ
          Returns the size of various system accounting structures. As
          above, the structure returned is governed by arg1.
     MP_SAGET1
          Returns the contents of various system accounting structures.
          The information is only for the processor specified by arg4. As
          above, the structure returned is governed by arg1. arg2 points
          to a buffer in the address space of the calling process and arg3
          specifies the maximum number of bytes to transfer.
     MP_SAGET
          Returns the contents of various system accounting structures.
          The information is summed across all processors before it is
          returned. As above, the structure returned is governed by arg1.
          arg2 points to a buffer in the address space of the calling
          process and arg3 specifies the maximum number of bytes to
          transfer.
     Possible errors from sysmp are:
     [EPERM]   The effective user ID is not superuser. Many of the
               commands require superuser privilege.
     [EPERM]   The user ID of the sending process is not superuser, and
               its real or effective user ID does not match the real,
               saved, or effective user ID of the receiving process.
     [ESRCH]   No process corresponding to that specified by a
               MP_MUSTRUN_PID, MP_GETMUSTRUN_PID, or MP_RUNANYWHERE_PID
               could be found.
     [EINVAL]  The processor named by a MP_EMPOWER, MP_RESTRICT, MP_CLOCK
               or MP_SAGET1 command does not exist.
     [EINVAL]  The cmd argument is invalid.
     [EINVAL]  The arg1 argument to a MP_KERNADDR command is invalid.
     [EINVAL]  An attempt was made via MP_MUSTRUN or MP_MUSTRUN_PID to
               move a process owning a CC sync register from the cpu
               controlling the CC sync register.
     [EINVAL]  The target of the MP_GETMUSTRUN command has not been set to
               run on a specific processor.
     [EBUSY]   An attempt was made to restrict the only unrestricted
               processor or to restrict the master processor.
     [EFAULT]  An invalid buffer address has been supplied by the calling
               process.

SEE ALSO
     mpadmin(1), runon(1), getpagesize(2), schedctl(2), timers(5)

DIAGNOSTICS
     Upon successful completion, the cmd dependent data is returned.
     Otherwise, a value of -1 is returned and errno is set to indicate the
     error.
[-- Attachment #3: mpadmin.man.txt --]
[-- Type: text/plain, Size: 5437 bytes --]
mpadmin(1)                                                      mpadmin(1)

NAME
     mpadmin - control and report processor status

SYNOPSIS
     mpadmin -n
     mpadmin -u[processor]
     mpadmin -r[processor]
     mpadmin -c[processor]
     mpadmin -f[processor]
     mpadmin -I[processor]
     mpadmin -U[processor]
     mpadmin -D[processor]
     mpadmin -C[processor]
     mpadmin -s

DESCRIPTION
     mpadmin provides control/information of processor status.
     Exactly one argument is accepted by mpadmin at each invocation. The
     following arguments are accepted:
     -n   Report which processors are physically configured. The numbers
          of the physically configured processors are written to the
          standard output, one processor number per line. Processors are
          numbered beginning from 0.
     -u[processor]
          When no processor is specified, the numbers of the processors
          that are available to schedule unrestricted processes are
          written to the standard output. Otherwise, mpadmin enables the
          processor number processor to run any unrestricted processes.
     -r[processor]
          When no processor is specified, the numbers of the processors
          that are restricted from running any processes (except those
          assigned via the sysmp(MP_MUSTRUN) function, the runon(1)
          command, or because of hardware necessity) are written to the
          standard output. Otherwise, mpadmin restricts the processor
          numbered processor.
     -c[processor]
          When no processor is specified, the number of the processor that
          handles the operating system software clock is written to the
          standard output. Otherwise, operating system software clock
          handling is moved to the processor numbered processor. See
          timers(5) for more details.
     -f[processor]
          When no processor is specified, the number of the processor that
          handles the operating system fast clock is written to the
          standard output. Otherwise, operating system fast clock handling
          is moved to the processor numbered processor. See ftimer(1) and
          timers(5) for a description of the fast clock usage.
     -I[processor]
          When no processor is specified, the numbers of the processors
          that are isolated are written to the standard output. Otherwise,
          mpadmin isolates the processor numbered processor. An isolated
          processor is restricted as by the -r argument. In addition,
          instruction cache and Translation Lookaside Buffer
          synchronization are blocked, and synchronization is delayed
          until a system service is requested.
     -U[processor]
          When no processor is specified, the numbers of the processors
          that are not isolated are written to the standard output.
          Otherwise, mpadmin unisolates the processor numbered processor.
     -D[processor]
          When no processor is specified, the numbers of the processors
          that are not running the clock scheduler are written to the
          standard output. Otherwise, mpadmin disables the clock scheduler
          on the processor numbered processor. This makes that processor
          nonpreemptive, so that normal IRIX process time slicing is no
          longer enforced. Processes that run on a non-preemptive
          processor are not preempted because of timer interrupts. They
          are preempted only when requesting a system service that causes
          them to wait, or that makes a higher-priority process runnable
          (for example, posting a semaphore).
     -C[processor]
          When no processor is specified, the numbers of the processors
          that are running the clock scheduler are written to the standard
          output. Otherwise, mpadmin enables the clock scheduler on the
          processor numbered processor. Processes on a preemptive
          processor can be preempted at the end of their time slice.
     -s   A summary of the unrestricted, restricted, isolated, preemptive
          and clock processor numbers is written to the standard output.

SEE ALSO
     ftimer(1), runon(1), sysmp(2), timers(5).

DIAGNOSTICS
     When an argument specifies a processor, 0 is returned on success, -1
     on failure. Otherwise, the number of processors associated with
     argument is returned.

WARNINGS
     It is not possible to restrict or isolate all processors. Processor 0
     must never be restricted or isolated.

BUGS
     Changing the clock processor may cause the system to lose a small
     amount of system time.
     When a processor is not provided as an argument, mpadmin's exit value
     will not exceed 255. If more than 255 processors exist, mpadmin will
     return 0.
* Re: [PATCH] syscall interface for cpu affinity
2002-03-11 0:08 ` Jeff Garzik
@ 2002-03-11 0:32 ` Tim Hockin
0 siblings, 0 replies; 16+ messages in thread
From: Tim Hockin @ 2002-03-11 0:32 UTC (permalink / raw)
To: Jeff Garzik
Cc: Tim Hockin, Robert Love, Andreas Jaeger, torvalds, linux-kernel
> > If we are going to pick an affinity system, please, let's consider sysmp().
>
> Not too bad. I picked a random sysmp(2) man page off the net (attached
> for ease of other's reference).
So, there are actually two parts to sysmp(). The way SGI used to do it is
with Pset (MP_PSET to sysmp()). They seem to have dropped exported support
for PSets - don't know why.
The idea is this. At boot the system creates a PSet with ALL processors,
and one set for each single CPU. Root can define extra sets with specified
CPUs, too. Processes can then run (commandline tool = 'runon') on a
specific Pset.
runon 3 yes # runs on PSET #3
This is ok, but it has several drawbacks:
* user can not run on an arbitrary set of procs
* defining a set for every combination of procs is ludicrous
However, it has several upsides:
* disabling a CPU is as simple as removing it from a pset struct, not
iterating over all tasks
* conceptually hides the 'bitmask of CPUs'
> It duplicates some stuff set elsewhere, and seems more than a bit like
> ioctl(2) by another name, but doesn't seem too bad. Note we should be
> careful not to overengineer the interface, either...
At some point Ralf Baechle asked me to extend it more for IRIX
compatibility. We may want to just drop that altogether. Several of the
sysmp() interfaces can be handled at the library layer and re-routed to
their existing interfaces.
> Just setting a bitmask does seem a bit limiting when thinking about the
> future, agreed.
What is the future of the existing CPUs bitmask? Is it becoming something
else? Perhaps we want to keep sysmp() in name and form, perhaps just in
name, perhaps not at all.
This is an area in which I have (had, but could get again) a lot of
interest, but before I waste any more time on it, I'd like to actually
co-design a feature set.
What do we want:
* unprivileged ability to change current->pset?
- any user can call sysmp(MP_RUNON) anytime
* privileged ability only (runon becomes suid)
- can "trap" processes to a CPU - it has been requested a lot
* processor sets or just bitmasks/lists?
- someone was working on memory sets, similarly to psets
If we really want this, I definitely want to help. :)
Tim
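The pset upside described above (disabling a CPU touches the sets, not every task) can be sketched in a few lines. This is a toy model under stated assumptions, not SGI's implementation or Tim's 2.2 port: struct pset, struct task, MAX_PSETS, task_cpus and pset_disable_cpu are all hypothetical names, and a single unsigned long is assumed to be wide enough for every CPU.

```c
#include <stddef.h>

#define MAX_PSETS 8                       /* hypothetical table size */

struct pset { unsigned long cpus; };      /* bitmask of CPUs in the set */
struct task { int pset_id; };             /* a task names a set, not CPUs */

static struct pset psets[MAX_PSETS];

/* CPUs a task may run on: a single lookup through its set. */
static unsigned long task_cpus(const struct task *t)
{
    return psets[t->pset_id].cpus;
}

/*
 * Take a CPU out of service: each set is touched once and no task is
 * visited. Assumes cpu is smaller than the bit width of unsigned long.
 */
static void pset_disable_cpu(unsigned int cpu)
{
    size_t i;

    for (i = 0; i < MAX_PSETS; i++)
        psets[i].cpus &= ~(1UL << cpu);
}
```

The cost of pset_disable_cpu() depends on the number of sets rather than the number of tasks, which is the trade-off against a per-task cpus_allowed mask that the thread is weighing.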
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 18:15 [PATCH] syscall interface for cpu affinity Robert Love
2002-03-10 20:29 ` Andreas Jaeger
@ 2002-03-10 22:05 ` Chris Wedgwood
2002-03-10 22:11 ` Robert Love
2002-03-11 0:38 ` Andreas Ferber
2 siblings, 1 reply; 16+ messages in thread
From: Chris Wedgwood @ 2002-03-10 22:05 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
On Sun, Mar 10, 2002 at 01:15:03PM -0500, Robert Love wrote:
    I have updated the patch a bit and resynced to 2.5.6. Are you
    interested? I believe a user interface for setting task CPU
    affinity is useful and completes the rest of our sched_* syscalls.
    A syscall implementation seems to be what everyone wants (I have a
    proc-interface, too...)
Can't we just copy the IRIX interface here as some other patches have
in the past?
--cw
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 22:05 ` Chris Wedgwood
@ 2002-03-10 22:11 ` Robert Love
0 siblings, 0 replies; 16+ messages in thread
From: Robert Love @ 2002-03-10 22:11 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: torvalds, linux-kernel
On Sun, 2002-03-10 at 17:05, Chris Wedgwood wrote:
> Can't we just copy the IRIX interface here as some other patches have
> in the past?
Is that psets? If so, no thanks.
I want a simple, clean, quick implementation. I have seen patches that
do a lot more than what my simple implementation does, and that really
does not interest me and I suspect Ingo and others feel the same way.
Setting a simple per-task bitmask that is inherited is all we need.
Linux scheduler API is already our own standard. I'd rather support
that (i.e. add another simple sched_* call) than some evil other
interface - but that is just me.
Robert Love
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 18:15 [PATCH] syscall interface for cpu affinity Robert Love
2002-03-10 20:29 ` Andreas Jaeger
2002-03-10 22:05 ` Chris Wedgwood
@ 2002-03-11 0:38 ` Andreas Ferber
2002-03-15 22:06 ` Stephen Samuel
2 siblings, 1 reply; 16+ messages in thread
From: Andreas Ferber @ 2002-03-11 0:38 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
On Sun, Mar 10, 2002 at 01:15:03PM -0500, Robert Love wrote:
>
> This patch implements
>
> int sched_set_affinity(pid_t pid, unsigned int len,
> unsigned long *new_mask_ptr);
>
> int sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
> unsigned long *user_mask_ptr)
>
> which set and get the cpu affinity (task->cpus_allowed) for a task,
> using the set_cpus_allowed function in Ingo's scheduler. The functions
> properly support changes to cpus_allowed, implement security, and are
> well-tested.
Setting the affinity of a whole process group also makes sense IMHO.
Therefore I think an interface more like the setpriority syscall
for sched_set_affinity (with two parameters which/who instead of a
single PID) would be more flexible, eg.
int sched_set_affinity(int which, int who, unsigned int len,
unsigned long *new_mask_ptr);
with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
to the value of who.
Getting the mask of a group of processes doesn't make sense though
(what if they differ?), so the current interface of sched_get_affinity
is just fine IMHO.
Andreas
--
Andreas Ferber - dev/consulting GmbH - Bielefeld, FRG
---------------------------------------------------------
+49 521 1365800 - af@devcon.net - www.devcon.net
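A setpriority-style target selection for affinity could be modelled as below. This is a sketch over a toy task table, not a proposed kernel implementation: struct model_task and set_affinity_which are hypothetical names, and the parameter roles follow setpriority(2) itself, where which is one of PRIO_PROCESS, PRIO_PGRP or PRIO_USER and who is the matching id.

```c
#include <stddef.h>
#include <sys/resource.h>   /* PRIO_PROCESS, PRIO_PGRP, PRIO_USER */

/* Toy task record; field names are illustrative only. */
struct model_task {
    int pid, pgid, uid;
    unsigned long cpus_allowed;
};

/*
 * Apply mask to every task selected by (which, who), using the same
 * selector convention as setpriority(2). Returns the number of tasks
 * updated, or -1 if which is not a known selector.
 */
static int set_affinity_which(int which, int who, unsigned long mask,
                              struct model_task *tasks, size_t ntasks)
{
    size_t i;
    int matched = 0;

    for (i = 0; i < ntasks; i++) {
        int id;

        switch (which) {
        case PRIO_PROCESS: id = tasks[i].pid;  break;
        case PRIO_PGRP:    id = tasks[i].pgid; break;
        case PRIO_USER:    id = tasks[i].uid;  break;
        default:           return -1;
        }
        if (id == who) {
            tasks[i].cpus_allowed = mask;
            matched++;
        }
    }
    return matched;
}
```

A get counterpart is deliberately omitted: as noted above, a group of tasks has no single mask to return when the members differ.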
* Re: [PATCH] syscall interface for cpu affinity
2002-03-11 0:38 ` Andreas Ferber
@ 2002-03-15 22:06 ` Stephen Samuel
2002-03-16 0:43 ` Andreas Ferber
0 siblings, 1 reply; 16+ messages in thread
From: Stephen Samuel @ 2002-03-15 22:06 UTC (permalink / raw)
To: Andreas Ferber; +Cc: Robert Love, torvalds, linux-kernel
Picking nits, but....
Andreas Ferber wrote:
> Setting the affinity of a whole process group also makes sense IMHO.
> Therefore I think an interface more like the setpriority syscall
> for sched_set_affinity (with two parameters which/who instead of a
> single PID) would be more flexible, eg.
>
> int sched_set_affinity(int which, int who, unsigned int len,
> unsigned long *new_mask_ptr);
>
> with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
> to the value of who.
I would suggest that the order be
int sched_set_affinity(int who, int which, unsigned int len,
unsigned long *new_mask_ptr);
This would have the {p,pg}id be the first thing that a programmer
would see (likely more important than the 'which').
--
Stephen Samuel +1(604)876-0426 samuel@bcgreen.com
http://www.bcgreen.com/~samuel/
Powerful committed communication, reaching through fear, uncertainty and
doubt to touch the jewel within each person and bring it to life.
* Re: [PATCH] syscall interface for cpu affinity
2002-03-15 22:06 ` Stephen Samuel
@ 2002-03-16 0:43 ` Andreas Ferber
2002-03-16 4:24 ` Stephen Samuel
0 siblings, 1 reply; 16+ messages in thread
From: Andreas Ferber @ 2002-03-16 0:43 UTC (permalink / raw)
To: Stephen Samuel; +Cc: Robert Love, torvalds, linux-kernel
On Fri, Mar 15, 2002 at 02:06:04PM -0800, Stephen Samuel wrote:
> >
> > int sched_set_affinity(int which, int who, unsigned int len,
> > unsigned long *new_mask_ptr);
> >
> > with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
> > to the value of who.
Uh, who/which should be just the other way round in the description
(but not in the prototype). Sorry.
> I would suggest that the order be
>
> int sched_set_affinity(int who, int which, unsigned int len,
> unsigned long *new_mask_ptr);
>
> This would have the {p,pg}id be the first thing that a programmer
> would see (likely more important than the 'which').
See my correction above, does that address your concern?
Andreas
--
Andreas Ferber - dev/consulting GmbH - Bielefeld, FRG
---------------------------------------------------------
+49 521 1365800 - af@devcon.net - www.devcon.net
* Re: [PATCH] syscall interface for cpu affinity
2002-03-16 0:43 ` Andreas Ferber
@ 2002-03-16 4:24 ` Stephen Samuel
0 siblings, 0 replies; 16+ messages in thread
From: Stephen Samuel @ 2002-03-16 4:24 UTC (permalink / raw)
To: Andreas Ferber; +Cc: Robert Love, torvalds, linux-kernel
Almost... Same effect (mostly)...
It does, however, leave us arguing the linguistic semantics of which
name 'who' should have. It seems to me that the most natural would be
with 'who' being the 'name' of the target, and 'which' specifying
which name space 'who' is operating in.
UGH: messing with these names via pronouns is too confusing:
-----------
How about this:
int sched_set_affinity(int who, int which, unsigned int len,
unsigned long *new_mask_ptr);
'who' being a {process, process-group or user} ID, and with 'which'
being one of {PRIO_PROCESS, PRIO_PGRP, PRIO_USER}, respectively --
specifying which namespace 'who' operates in.
I think that that is what you were trying to say, right?
Andreas Ferber wrote:
> On Fri, Mar 15, 2002 at 02:06:04PM -0800, Stephen Samuel wrote:
>
>> >
>> > int sched_set_affinity(int which, int who, unsigned int len,
>> > unsigned long *new_mask_ptr);
>> >
>> > with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
>> > to the value of who.
>> >
>
> Uh, who/which should be just the other way round in the description
> (but not in the prototype). Sorry.
>
>
>>I would suggest that the order be
>>
>>int sched_set_affinity(int who, int which, unsigned int len,
>> unsigned long *new_mask_ptr);
>>
>>This would have the {p,pg}id be the first thing that a programmer
>>would see (likely more important than the 'which').
--
Stephen Samuel +1(604)876-0426 samuel@bcgreen.com
http://www.bcgreen.com/~samuel/
Powerful committed communication, reaching through fear, uncertainty and
doubt to touch the jewel within each person and bring it to life.