* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 23:45 ` Jeff Garzik
@ 1976-03-03 15:58 ` Tim Hockin
2002-03-11 0:08 ` Jeff Garzik
0 siblings, 1 reply; 16+ messages in thread
From: Tim Hockin @ 1976-03-03 15:58 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Robert Love, Andreas Jaeger, torvalds, linux-kernel
> Anon! But there is something uber-ugly about constantly jamming more
> and more stuff into procfs without thinking or planning long term... I
> vote for the non-procfs approach :)
At some point I had done a port of SGI's pset/sysmp interface to linux 2.2.
As far as I know, lots of people are still using it. I haven't ported it
to 2.4 for various reasons, but I have to say - IT IS A MUCH BETTER
INTERFACE than all these ad-hoc cpus_allowed bits.
If I thought that it had a chance of inclusion, maybe I'd port it up, but
last I heard none of the "core" people wanted it.
If we are going to pick an affinity system, please, let's consider sysmp().
Tim
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] syscall interface for cpu affinity
@ 2002-03-10 18:15 Robert Love
2002-03-10 20:29 ` Andreas Jaeger
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Robert Love @ 2002-03-10 18:15 UTC (permalink / raw)
To: torvalds; +Cc: linux-kernel
Linus,
I have updated the patch a bit and resynced to 2.5.6. Are you
interested? I believe a user interface for setting task CPU affinity is
useful and completes the rest of our sched_* syscalls. A syscall
implementation seems to be what everyone wants (I have a proc-interface,
too...)
This patch implements
int sched_set_affinity(pid_t pid, unsigned int len,
unsigned long *new_mask_ptr);
int sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
unsigned long *user_mask_ptr);
which set and get the cpu affinity (task->cpus_allowed) for a task,
using the set_cpus_allowed function in Ingo's scheduler. The functions
properly support changes to cpus_allowed, implement security, and are
well-tested.
They are based on Ingo's older affinity syscall patch and my older
affinity proc patch.
Comments?
Robert Love
diff -urN linux-2.5.6/arch/i386/kernel/entry.S linux/arch/i386/kernel/entry.S
--- linux-2.5.6/arch/i386/kernel/entry.S Thu Mar 7 21:18:19 2002
+++ linux/arch/i386/kernel/entry.S Sun Mar 10 13:01:03 2002
@@ -717,6 +717,8 @@
.long SYMBOL_NAME(sys_fremovexattr)
.long SYMBOL_NAME(sys_tkill)
.long SYMBOL_NAME(sys_sendfile64)
+ .long SYMBOL_NAME(sys_sched_set_affinity) /* 240 */
+ .long SYMBOL_NAME(sys_sched_get_affinity)
.rept NR_syscalls-(.-sys_call_table)/4
.long SYMBOL_NAME(sys_ni_syscall)
diff -urN linux-2.5.6/include/asm-i386/unistd.h linux/include/asm-i386/unistd.h
--- linux-2.5.6/include/asm-i386/unistd.h Thu Mar 7 21:18:55 2002
+++ linux/include/asm-i386/unistd.h Sun Mar 10 13:03:41 2002
@@ -244,6 +244,8 @@
#define __NR_fremovexattr 237
#define __NR_tkill 238
#define __NR_sendfile64 239
+#define __NR_sched_set_affinity 240
+#define __NR_sched_get_affinity 241
/* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
diff -urN linux-2.5.6/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.6/kernel/sched.c Thu Mar 7 21:18:19 2002
+++ linux/kernel/sched.c Sun Mar 10 12:59:26 2002
@@ -1215,6 +1215,95 @@
return retval;
}
+/**
+ * sys_sched_set_affinity - set the cpu affinity of a process
+ * @pid: pid of the process
+ * @len: length of new_mask
+ * @new_mask_ptr: user-space pointer to the new cpu mask
+ */
+asmlinkage int sys_sched_set_affinity(pid_t pid, unsigned int len,
+ unsigned long *new_mask_ptr)
+{
+ unsigned long new_mask;
+ task_t *p;
+ int retval;
+
+ if (len < sizeof(new_mask))
+ return -EINVAL;
+
+ if (copy_from_user(&new_mask, new_mask_ptr, sizeof(new_mask)))
+ return -EFAULT;
+
+ new_mask &= cpu_online_map;
+ if (!new_mask)
+ return -EINVAL;
+
+ read_lock(&tasklist_lock);
+
+ retval = -ESRCH;
+ p = find_process_by_pid(pid);
+ if (!p)
+ goto out_unlock;
+
+ retval = -EPERM;
+ if ((current->euid != p->euid) && (current->euid != p->uid) &&
+ !capable(CAP_SYS_NICE))
+ goto out_unlock;
+
+ retval = 0;
+#ifdef CONFIG_SMP
+ set_cpus_allowed(p, new_mask);
+#endif
+
+out_unlock:
+ read_unlock(&tasklist_lock);
+ return retval;
+}
+
+/**
+ * sys_sched_get_affinity - get the cpu affinity of a process
+ * @pid: pid of the process
+ * @user_len_ptr: userspace pointer to the length of the mask
+ * @user_mask_ptr: userspace pointer to the mask
+ */
+asmlinkage int sys_sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
+ unsigned long *user_mask_ptr)
+{
+ unsigned long mask;
+ unsigned int len, user_len;
+ task_t *p;
+ int retval;
+
+ len = sizeof(mask);
+
+ if (copy_from_user(&user_len, user_len_ptr, sizeof(user_len)))
+ return -EFAULT;
+
+ if (copy_to_user(user_len_ptr, &len, sizeof(len)))
+ return -EFAULT;
+
+ if (user_len < len)
+ return -EINVAL;
+
+ read_lock(&tasklist_lock);
+
+ retval = -ESRCH;
+ p = find_process_by_pid(pid);
+ if (!p)
+ goto out_unlock;
+
+ retval = 0;
+ mask = p->cpus_allowed & cpu_online_map;
+
+out_unlock:
+ read_unlock(&tasklist_lock);
+ if (retval)
+ return retval;
+ if (copy_to_user(user_mask_ptr, &mask, sizeof(mask)))
+ return -EFAULT;
+ return 0;
+}
+
asmlinkage long sys_sched_yield(void)
{
runqueue_t *rq;
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 18:15 [PATCH] syscall interface for cpu affinity Robert Love
@ 2002-03-10 20:29 ` Andreas Jaeger
2002-03-10 20:53 ` Robert Love
2002-03-10 22:05 ` Chris Wedgwood
2002-03-11 0:38 ` Andreas Ferber
2 siblings, 1 reply; 16+ messages in thread
From: Andreas Jaeger @ 2002-03-10 20:29 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
Robert Love <rml@tech9.net> writes:
> Linus,
>
> I have updated the patch a bit and resynced to 2.5.6. Are you
> interested? I believe a user interface for setting task CPU affinity is
> useful and completes the rest of our sched_* syscalls. A syscall
> implementation seems to be what everyone wants (I have a proc-interface,
> too...)
Please add the procinterface also! I've found it today (for 2.4.18)
and it's much easier to use with existing programs.
Andreas
> This patch implements
>
> int sched_set_affinity(pid_t pid, unsigned int len,
> unsigned long *new_mask_ptr);
>
> int sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
> unsigned long *user_mask_ptr);
>
> which set and get the cpu affinity (task->cpus_allowed) for a task,
> using the set_cpus_allowed function in Ingo's scheduler. The functions
> properly support changes to cpus_allowed, implement security, and are
> well-tested.
>
> They are based on Ingo's older affinity syscall patch and my older
> affinity proc patch.
>
> Comments?
Please add it for all archs - this is not only interesting for x86,
Andreas
[...]
--
Andreas Jaeger
SuSE Labs aj@suse.de
private aj@arthur.inka.de
http://www.suse.de/~aj
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 20:29 ` Andreas Jaeger
@ 2002-03-10 20:53 ` Robert Love
2002-03-10 21:03 ` Andreas Jaeger
2002-03-10 23:45 ` Jeff Garzik
0 siblings, 2 replies; 16+ messages in thread
From: Robert Love @ 2002-03-10 20:53 UTC (permalink / raw)
To: Andreas Jaeger; +Cc: torvalds, linux-kernel
On Sun, 2002-03-10 at 15:29, Andreas Jaeger wrote:
> Please add the procinterface also! I've found it today (for 2.4.18)
> and it's much easier to use with existing programs.
I agree and I really like the proc-interface. There is something uber
cool about:
cat 1 > /proc/pid/affinity
I have a patch for 2.5.6 for proc-based affinity interface here:
http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/v2.5/cpu-affinity-proc-rml-2.5.6-1.patch
I suspect, however, that despite both patches being small we really only
want to pick and standardize on one. The syscall interface has two main
things going for it against a proc-based implementation: it is faster
and /proc may not be mounted. The masses have spoken on this issue.
Note you can use the syscall interface with existing programs, too.
Just write a program to take in a pid and mask and call
sched_set_affinity.
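
A rough sketch of such a helper follows. One assumption is stated loudly: it
does not call syscall 240 from this patch. It uses the sched_setaffinity
syscall that mainline Linux later gained, which takes a similar
(pid, len, mask pointer) triple. parse_cpu_mask and set_affinity are
hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Parse a CPU mask such as "3" or "0x3". Returns 0 on success and -1 on
 * malformed input, a minus sign, or an empty (zero) mask.
 */
int parse_cpu_mask(const char *text, unsigned long *mask)
{
    char *end;
    unsigned long value;

    assert(text != NULL && mask != NULL);
    if (strchr(text, '-') != NULL)
        return -1;
    errno = 0;
    value = strtoul(text, &end, 0);
    if (errno != 0 || end == text || *end != '\0' || value == 0)
        return -1;
    *mask = value;
    return 0;
}

/*
 * Pin `pid` to the CPUs in `mask` (pid 0 means the caller).
 * Assumption: the running kernel provides SYS_sched_setaffinity.
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_affinity(pid_t pid, unsigned long mask)
{
    return (int)syscall(SYS_sched_setaffinity, pid, sizeof(mask), &mask);
}
```

A main() around these two would read the pid from argv[1] and the mask from
argv[2], then report errno if set_affinity fails.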
> Please add it for all archs - this is not only interesting for x86,
I'll send Linus the patch for other arches if/when he accepts this patch
- I have no problem with that.
Robert Love
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 20:53 ` Robert Love
@ 2002-03-10 21:03 ` Andreas Jaeger
2002-03-10 22:23 ` Andreas Schwab
2002-03-10 23:56 ` Andreas Ferber
2002-03-10 23:45 ` Jeff Garzik
1 sibling, 2 replies; 16+ messages in thread
From: Andreas Jaeger @ 2002-03-10 21:03 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
Robert Love <rml@tech9.net> writes:
> On Sun, 2002-03-10 at 15:29, Andreas Jaeger wrote:
>
>> Please add the procinterface also! I've found it today (for 2.4.18)
>> and it's much easier to use with existing programs.
>
> I agree and I really like the proc-interface. There is something uber
> cool about:
>
> cat 1 > /proc/pid/affinity
I agree.
> I have a patch for 2.5.6 for proc-based affinity interface here:
>
> http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/v2.5/cpu-affinity-proc-rml-2.5.6-1.patch
>
> I suspect, however, that despite both patches being small we really only
> want to pick and standardize on one. The syscall interface has two main
> things going for it against a proc-based implementation: it is faster
> and /proc may not be mounted. The masses have spoken on this issue.
>
> Note you can use the syscall interface with existing programs, too.
> Just write a program to take in a pid and mask and call
> sched_set_affinity.
What I need at the moment is a wrapper - and you can do it two ways:
$ run_with_affinity 1 program arguments...
$ (cat 1 > /proc/self/affinity; program arguments...)
The second one is much easier coded ;-)
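
For comparison, the first variant is not much code either. A minimal sketch,
assuming the sched_setaffinity syscall that mainline Linux later gained
(not the syscall numbers in this patch) and relying on the mask surviving
exec; run_with_affinity here is a hypothetical function, not an existing tool:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Restrict the calling process to `mask`, then replace it with the
 * program named by argv[0]. Only returns on failure: -1 with errno set.
 */
int run_with_affinity(unsigned long mask, char *const argv[])
{
    assert(argv != NULL);
    if (mask == 0 || argv[0] == NULL) {
        errno = EINVAL;
        return -1;
    }
    /* pid 0 selects the calling process. */
    if (syscall(SYS_sched_setaffinity, 0, sizeof(mask), &mask) != 0)
        return -1;
    execvp(argv[0], argv);
    return -1;   /* execvp only returns if it failed */
}
```

A main() would parse its first argument as the mask and pass the rest of
argv through unchanged.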
>> Please add it for all archs - this is not only interesting for x86,
>
> I'll send Linus the patch for other arches if/when he accepts this patch
> - I have no problem with that.
Thanks,
Andreas
--
Andreas Jaeger
SuSE Labs aj@suse.de
private aj@arthur.inka.de
http://www.suse.de/~aj
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 18:15 [PATCH] syscall interface for cpu affinity Robert Love
2002-03-10 20:29 ` Andreas Jaeger
@ 2002-03-10 22:05 ` Chris Wedgwood
2002-03-10 22:11 ` Robert Love
2002-03-11 0:38 ` Andreas Ferber
2 siblings, 1 reply; 16+ messages in thread
From: Chris Wedgwood @ 2002-03-10 22:05 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
On Sun, Mar 10, 2002 at 01:15:03PM -0500, Robert Love wrote:
I have updated the patch a bit and resynced to 2.5.6. Are you
interested? I believe a user interface for setting task CPU
affinity is useful and completes the rest of our sched_* syscalls.
A syscall implementation seems to be what everyone wants (I have a
proc-interface, too...)
Can't we just copy the IRIX interface here as some other patches have
in the past?
--cw
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 22:05 ` Chris Wedgwood
@ 2002-03-10 22:11 ` Robert Love
0 siblings, 0 replies; 16+ messages in thread
From: Robert Love @ 2002-03-10 22:11 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: torvalds, linux-kernel
On Sun, 2002-03-10 at 17:05, Chris Wedgwood wrote:
> Can't we just copy the IRIX interface here as some other patches have
> in the past?
Is that psets? If so, no thanks.
I want a simple, clean, quick implementation. I have seen patches that
do a lot more than what my simple implementation does, and that really
does not interest me and I suspect Ingo and others feel the same way.
Setting a simple per-task bitmask that is inherited is all we need.
Linux scheduler API is already our own standard. I'd rather support
that (i.e. add another simple sched_* call) than some evil other
interface - but that is just me.
Robert Love
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 21:03 ` Andreas Jaeger
@ 2002-03-10 22:23 ` Andreas Schwab
2002-03-10 23:56 ` Andreas Ferber
1 sibling, 0 replies; 16+ messages in thread
From: Andreas Schwab @ 2002-03-10 22:23 UTC (permalink / raw)
To: Andreas Jaeger; +Cc: Robert Love, torvalds, linux-kernel
Andreas Jaeger <aj@suse.de> writes:
|> What I need at the moment is a wrapper - and you can do it two ways:
|>
|> $ run_with_affinity 1 program arguments...
|> $ (cat 1 > /proc/self/affinity; program arguments...)
|>
|> The second one is much easier coded ;-)
Apparently not, since that should be
$ (echo 1 > /proc/self/affinity; program arguments...)
:-)
Andreas.
--
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE GmbH, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 20:53 ` Robert Love
2002-03-10 21:03 ` Andreas Jaeger
@ 2002-03-10 23:45 ` Jeff Garzik
1976-03-03 15:58 ` Tim Hockin
1 sibling, 1 reply; 16+ messages in thread
From: Jeff Garzik @ 2002-03-10 23:45 UTC (permalink / raw)
To: Robert Love; +Cc: Andreas Jaeger, torvalds, linux-kernel
Robert Love wrote:
>
> On Sun, 2002-03-10 at 15:29, Andreas Jaeger wrote:
>
> > Please add the procinterface also! I've found it today (for 2.4.18)
> > and it's much easier to use with existing programs.
>
> I agree and I really like the proc-interface. There is something uber
> cool about:
>
> cat 1 > /proc/pid/affinity
>
> I have a patch for 2.5.6 for proc-based affinity interface here:
>
> http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/v2.5/cpu-affinity-proc-rml-2.5.6-1.patch
Anon! But there is something uber-ugly about constantly jamming more
and more stuff into procfs without thinking or planning long term... I
vote for the non-procfs approach :)
--
Jeff Garzik | Usenet Rule #2 (John Gilmore): "The Net interprets
Building 1024 | censorship as damage and routes around it."
MandrakeSoft |
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 21:03 ` Andreas Jaeger
2002-03-10 22:23 ` Andreas Schwab
@ 2002-03-10 23:56 ` Andreas Ferber
1 sibling, 0 replies; 16+ messages in thread
From: Andreas Ferber @ 2002-03-10 23:56 UTC (permalink / raw)
To: Andreas Jaeger; +Cc: Robert Love, linux-kernel
On Sun, Mar 10, 2002 at 10:03:02PM +0100, Andreas Jaeger wrote:
> >
> > Note you can use the syscall interface with existing programs, too.
> > Just write a program to take in a pid and mask and call
> > sched_set_affinity.
> What I need at the moment is a wrapper - and you can do it two ways:
>
> $ run_with_affinity 1 program arguments...
> $ (cat 1 > /proc/self/affinity; program arguments...)
>
> The second one is much easier coded ;-)
$ (set_affinity 1; program arguments...)
set_affinity just calls sched_set_affinity(getppid()), and everything
is fine (and even shorter to type) :-)
Andreas
--
Andreas Ferber - dev/consulting GmbH - Bielefeld, FRG
---------------------------------------------------------
+49 521 1365800 - af@devcon.net - www.devcon.net
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
1976-03-03 15:58 ` Tim Hockin
@ 2002-03-11 0:08 ` Jeff Garzik
2002-03-11 0:32 ` Tim Hockin
0 siblings, 1 reply; 16+ messages in thread
From: Jeff Garzik @ 2002-03-11 0:08 UTC (permalink / raw)
To: Tim Hockin; +Cc: Robert Love, Andreas Jaeger, torvalds, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 642 bytes --]
Tim Hockin wrote:
> If we are going to pick an affinity system, please, let's consider sysmp().
Not too bad. I picked a random sysmp(2) man page off the net (attached
for ease of other's reference).
It duplicates some stuff set elsewhere, and seems more than a bit like
ioctl(2) by another name, but doesn't seem too bad. Note we should be
careful not to overengineer the interface, either...
Just setting a bitmask does seem a bit limiting when thinking about the
future, agreed.
--
Jeff Garzik | Usenet Rule #2 (John Gilmore): "The Net interprets
Building 1024 | censorship as damage and routes around it."
MandrakeSoft |
[-- Attachment #2: sysmp.man.txt --]
[-- Type: text/plain, Size: 10690 bytes --]
sysmp - multiprocessing control
C SYNOPSIS
#include <sys/types.h>
#include <sys/sysmp.h>
#include <sys/sysinfo.h> /* for SAGET and MINFO structures */
int sysmp (int cmd, ...);
ptrdiff_t sysmp (int cmd, ...);
DESCRIPTION
sysmp provides control/information for miscellaneous system services.
This system call is usually used by system programs and is not intended
for general use. The arguments arg1, arg2, arg3, arg4 are provided for
command-dependent use.
As specified by cmd, the following commands are available:
MP_CLEARCFSSTAT
MP_CLEARNFSSTAT
MP_NUMA_GETCPUNODEMAP
MP_NUMA_GETDISTMATRIX
These are all interfaces that are used to implement
various system library functions. They are all subject to
change and should not be called directly by applications.
MP_PGSIZE The page size of the system is returned (see
getpagesize(2)).
MP_SCHED Interface for the schedctl(2) system call.
MP_NPROCS Returns the number of processors physically configured.
MP_NAPROCS Returns the number of processors that are available to
schedule unrestricted processes.
MP_STAT The processor ids and status flag bits of the physically
configured processors are copied into an array of pda_stat
structures to which arg1 points. The array must be large
enough to hold as many pda_stat structures as the number
of processors returned by the MP_NPROCS sysmp command.
The pda_stat structure and the various status bits are
defined in <sys/pda.h>.
MP_EMPOWER The processor number given by arg1, interpreted as an
'int', is empowered to run any unrestricted processes.
This is the default for all processors. This command
requires superuser authority.
MP_RESTRICT The processor number given by arg1, interpreted as an
'int', is restricted from running any processes except
those assigned to it by a MP_MUSTRUN or MP_MUSTRUN_PID
command, a runon(1) command or because of hardware
necessity. Note that processor 0 cannot be restricted.
This command requires superuser authority. On Challenge
Series machines, all timers belonging to the processor are
moved to the processor that owns the clock as reported by
MP_CLOCK.
MP_ISOLATE The processor number given by arg1, interpreted as an
'int', is isolated from running any processes except those
assigned to it by a MP_MUSTRUN command, a runon(1) command
or because of hardware necessity. Instruction cache and
Translation Lookaside Buffer synchronization across
processors in the system is minimized or delayed on an
isolated processor until system services are requested.
Note that processor 0 cannot be isolated. This command
requires superuser authority. On Challenge Series
machines, all timers belonging to the processor are moved
to the processor that owns the clock as reported by
MP_CLOCK.
MP_UNISOLATE The processor number given by arg1, interpreted as an
'int', is unisolated and empowered to run any unrestricted
processes. This is the default system configuration for
all processors. This command requires superuser
authority.
MP_PREEMPTIVE The processor number given by arg1, interpreted as an
'int', has its clock scheduler enabled. This is the
default for all processors. This command requires
superuser authority.
MP_NONPREEMPTIVE
The processor number given by arg1, interpreted as an
'int', has its clock scheduler disabled. Normal process
time slicing is no longer enforced on that processor. As
a result of turning off the clock interrupt, the interrupt
latency on this processor will be lower. This command
requires superuser authority and is allowed only on an
isolated processor. This command is not allowed on the
clock processor (see MP_CLOCK).
MP_CLOCK The processor number given by arg1, interpreted as an
'int', is given charge of the operating system software
clock (see timers(5)). This command requires superuser
authority.
MP_FASTCLOCK The processor number given by arg1, interpreted as an
'int', is given charge of the operating system software
fast clock (see timers(5)). This command requires
superuser authority.
MP_MISER_GETREQUEST
MP_MISER_SENDREQUEST
MP_MISER_RESPOND
MP_MISER_GETRESOURCE
MP_MISER_SETRESOURCE
MP_MISER_CHECKACCESS
These are all interfaces that are used to implement
various miser(1) functions. These are all subject to
change and should not be called directly by applications.
MP_MUSTRUN Assigns the calling process to run only on the processor
number by arg1, interpreted as an 'int', except as
required for communications with hardware devices. A
process that has allocated a CC sync register (see
ccsync(7m)) is restricted to running on a particular cpu.
Attempts to reassign such a process to another cpu will
fail until the CC sync register has been relinquished.
MP_MUSTRUN_PID Assigns the process specified by arg2 to run only on the
processor number specified by arg1, both interpreted as
'int', except as required for communications with hardware
devices. A process that has allocated a CC sync register
(see ccsync(7m)) is restricted to running on a particular
cpu. Attempts to reassign such a process to another cpu
will fail until the CC sync register has been
relinquished.
MP_GETMUSTRUN Returns the processor the current process has been set to
run on using the MP_MUSTRUN command. If the current
process has not been assigned to a specific processor, -1
is returned and errno is set to EINVAL.
MP_GETMUSTRUN_PID
Returns the processor that the process specified by arg1
has been set to run on using the MP_MUSTRUN or
MP_MUSTRUN_PID command. If the process has not been
assigned to a specific processor, -1 is returned and errno
is set to EINVAL.
MP_RUNANYWHERE Frees the calling process to run on whatever processor the
system deems suitable.
MP_RUNANYWHERE_PID
Frees the process specified by arg1 to run on whatever
processor the system deems suitable.
MP_KERNADDR Returns the address of various kernel data structures.
The structure returned is selected by arg1. The list of
available structures is detailed in <sys/sysmp.h>. This
option is used by many system programs to avoid having to
look in /unix for the location of the data structures.
MP_SASZ Returns the size of various system accounting structures.
As above, the structure returned is governed by arg1.
MP_SAGET1 Returns the contents of various system accounting
structures. The information is only for the processor
specified by arg4. As above, the structure returned is
governed by arg1. arg2 points to a buffer in the address
space of the calling process and arg3 specifies the
maximum number of bytes to transfer.
MP_SAGET Returns the contents of various system accounting
structures. The information is summed across all
processors before it is returned. As above, the structure
returned is governed by arg1. arg2 points to a buffer in
the address space of the calling process and arg3
specifies the maximum number of bytes to transfer.
Possible errors from sysmp are:
[EPERM] The effective user ID is not superuser. Many of the commands
require superuser privilege.
[EPERM] The user ID of the sending process is not superuser, and its
real or effective user ID does not match the real, saved, or
effective user ID of the receiving process.
[ESRCH] No process corresponding to that specified by a
MP_MUSTRUN_PID, MP_GETMUSTRUN_PID, or MP_RUNANYWHERE_PID
could be found.
[EINVAL] The processor named by a MP_EMPOWER, MP_RESTRICT, MP_CLOCK or
MP_SAGET1 command does not exist.
[EINVAL] The cmd argument is invalid.
[EINVAL] The arg1 argument to a MP_KERNADDR command is invalid.
[EINVAL] An attempt was made via MP_MUSTRUN or MP_MUSTRUN_PID to move
a process owning a CC sync register from the cpu controlling
the CC sync register.
[EINVAL] The target of the MP_GETMUSTRUN command has not been set to
run on a specific processor.
[EBUSY] An attempt was made to restrict the only unrestricted
processor or to restrict the master processor.
[EFAULT] An invalid buffer address has been supplied by the calling
process.
SEE ALSO
mpadmin(1), runon(1), getpagesize(2), schedctl(2), timers(5)
DIAGNOSTICS
Upon successful completion, the cmd dependent data is returned.
Otherwise, a value of -1 is returned and errno is set to indicate the
error.
[-- Attachment #3: mpadmin.man.txt --]
[-- Type: text/plain, Size: 5437 bytes --]
mpadmin(1) mpadmin(1)
NAME
mpadmin - control and report processor status
SYNOPSIS
mpadmin -n
mpadmin -u[processor]
mpadmin -r[processor]
mpadmin -c[processor]
mpadmin -f[processor]
mpadmin -I[processor]
mpadmin -U[processor]
mpadmin -D[processor]
mpadmin -C[processor]
mpadmin -s
DESCRIPTION
mpadmin provides control/information of processor status.
Exactly one argument is accepted by mpadmin at each invocation. The
following arguments are accepted:
-n Report which processors are physically configured. The
numbers of the physically configured processors are written
to the standard output, one processor number per line.
Processors are numbered beginning from 0.
-u[processor]
When no processor is specified, the numbers of the
processors that are available to schedule unrestricted
processes are written to the standard output. Otherwise,
mpadmin enables the processor number processor to run any
unrestricted processes.
-r[processor]
When no processor is specified, the numbers of the
processors that are restricted from running any processes
(except those assigned via the sysmp(MP_MUSTRUN) function,
the runon(1) command, or because of hardware necessity) are
written to the standard output. Otherwise, mpadmin
restricts the processor numbered processor.
-c[processor]
When no processor is specified, the number of the processor
that handles the operating system software clock is written
to the standard output. Otherwise, operating system
software clock handling is moved to the processor numbered
processor. See timers(5) for more details.
-f[processor]
When no processor is specified, the number of the processor
that handles the operating system fast clock is written to
the standard output. Otherwise, operating system fast clock
handling is moved to the processor numbered processor. See
ftimer(1) and timers(5) for a description of the fast clock
usage.
-I[processor]
When no processor is specified, the numbers of the
processors that are isolated are written to the standard
output. Otherwise, mpadmin isolates the processor numbered
processor. An isolated processor is restricted as by the -r
argument. In addition, instruction cache and Translation
Lookaside Buffer synchronization are blocked, and
synchronization is delayed until a system service is
requested.
-U[processor]
When no processor is specified, the numbers of the
processors that are not isolated are written to the standard
output. Otherwise, mpadmin unisolates the processor
numbered processor.
-D[processor]
When no processor is specified, the numbers of the
processors that are not running the clock scheduler are
written to the standard output. Otherwise, mpadmin disables
the clock scheduler on the processor numbered processor.
This makes that processor nonpreemptive, so that normal IRIX
process time slicing is no longer enforced. Processes that
run on a non-preemptive processor are not preempted because
of timer interrupts. They are preempted only when
requesting a system service that causes them to wait, or
that makes a higher-priority process runnable (for example,
posting a semaphore).
-C[processor]
When no processor is specified, the numbers of the
processors that are running the clock scheduler are written
to the standard output. Otherwise, mpadmin enables the
clock scheduler on the processor numbered processor.
Processes on a preemptive processor can be preempted at the
end of their time slice.
-s A summary of the unrestricted, restricted, isolated,
preemptive and clock processor numbers is written to the
standard output.
SEE ALSO
ftimer(1), runon(1), sysmp(2), timers(5).
DIAGNOSTICS
When an argument specifies a processor, 0 is returned on success, -1 on
failure. Otherwise, the number of processors associated with argument is
returned.
WARNINGS
It is not possible to restrict or isolate all processors. Processor 0
must never be restricted or isolated.
BUGS
Changing the clock processor may cause the system to lose a small amount
of system time.
When a processor is not provided as an argument, mpadmin's exit value
will not exceed 255. If more than 255 processors exist, mpadmin will
return 0.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-11 0:08 ` Jeff Garzik
@ 2002-03-11 0:32 ` Tim Hockin
0 siblings, 0 replies; 16+ messages in thread
From: Tim Hockin @ 2002-03-11 0:32 UTC (permalink / raw)
To: Jeff Garzik
Cc: Tim Hockin, Robert Love, Andreas Jaeger, torvalds, linux-kernel
> > If we are going to pick an affinity system, please, let's consider sysmp().
>
> Not too bad. I picked a random sysmp(2) man page off the net (attached
> for ease of other's reference).
So, there are actually two parts to sysmp(). The way SGI used to do it is
with Pset (MP_PSET to sysmp()). They seem to have dropped exported support
for PSets - don't know why. The idea is this.
At boot the system creates a PSet with ALL processors, and one set for each
single CPU. Root can define extra sets with specified CPUs, too.
Processes can then run (commandline tool = 'runon') on a specific Pset.
runon 3 yes # runs on PSET #3
This is ok, but it has several drawbacks:
* user can not run on an arbitrary set of procs
* defining a set for every combination of procs is ludicrous
However, it has several upsides
* disabling a CPU is as simple as removing it from a pset struct, not
iterating over all tasks
* conceptually hides the 'bitmask of CPUs'
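
To make the trade-off concrete, here is a small userspace sketch of a pset
table. It is only an illustration of the idea as described above: the struct
layout, the function names and the numbering of sets (set 0 for all CPUs,
sets 1..N for single CPUs) are assumptions, not the IRIX or 2.2-port layout.

```c
#include <assert.h>
#include <stddef.h>

#define PSET_MAX 32   /* hypothetical table size */

struct pset_table {
    unsigned long cpus[PSET_MAX];   /* one CPU bitmask per set */
    size_t count;
};

/* Boot-time setup: set 0 holds all CPUs, sets 1..ncpus hold one CPU each. */
void pset_boot(struct pset_table *t, unsigned int ncpus)
{
    assert(t != NULL && ncpus > 0 && ncpus < PSET_MAX &&
           ncpus < 8 * sizeof(unsigned long));
    t->cpus[0] = (1UL << ncpus) - 1;
    for (unsigned int cpu = 0; cpu < ncpus; cpu++)
        t->cpus[cpu + 1] = 1UL << cpu;
    t->count = ncpus + 1;
}

/* Root defines an extra set; returns its id, or -1 if the table is full. */
int pset_define(struct pset_table *t, unsigned long mask)
{
    assert(t != NULL);
    if (t->count >= PSET_MAX)
        return -1;
    t->cpus[t->count] = mask;
    return (int)t->count++;
}

/* Disabling a CPU walks the sets, not every task in the system. */
void pset_disable_cpu(struct pset_table *t, unsigned int cpu)
{
    assert(t != NULL && cpu < 8 * sizeof(unsigned long));
    for (size_t i = 0; i < t->count; i++)
        t->cpus[i] &= ~(1UL << cpu);
}
```

A task would then carry a set id rather than a mask, which is why a user
cannot pick an arbitrary combination of CPUs unless root has defined a set
for it.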
> It duplicates some stuff set elsewhere, and seems more than a bit like
> ioctl(2) by another name, but doesn't seem too bad. Note we should be
> careful not to overengineer the interface, either...
At some point Ralf Baechle asked me to extend it more for IRIX
compatibility. We may want to just drop that altogether. Several of the
sysmp() interfaces can be handled at the library layer and re-routed to
their existing interfaces.
> Just setting a bitmask does seem a bit limiting when thinking about the
> future, agreed.
What is the future of the existing CPUs bitmask? Is it becoming something
else?
Perhaps we want to keep sysmp() in name and form, perhaps just in name,
perhaps not at all. This is an area in which I have (had, but could get
again) a lot of interest, but before I waste any more time on it, I'd like
to actually co-design a feature set.
What do we want:
* unprivileged ability to change current->pset?
- any user can call sysmp(MP_RUNON) anytime
* privileged ability only (runon becomes suid)
- can "trap" processes to a CPU - it has been requested a lot
* processor sets or just bitmasks/lists?
- someone was working on memory sets, similarly to psets
If we really want this, I definitely want to help. :)
Tim
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] syscall interface for cpu affinity
2002-03-10 18:15 [PATCH] syscall interface for cpu affinity Robert Love
2002-03-10 20:29 ` Andreas Jaeger
2002-03-10 22:05 ` Chris Wedgwood
@ 2002-03-11 0:38 ` Andreas Ferber
2002-03-15 22:06 ` Stephen Samuel
2 siblings, 1 reply; 16+ messages in thread
From: Andreas Ferber @ 2002-03-11 0:38 UTC (permalink / raw)
To: Robert Love; +Cc: torvalds, linux-kernel
On Sun, Mar 10, 2002 at 01:15:03PM -0500, Robert Love wrote:
>
> This patch implements
>
> int sched_set_affinity(pid_t pid, unsigned int len,
> unsigned long *new_mask_ptr);
>
> int sched_get_affinity(pid_t pid, unsigned int *user_len_ptr,
> unsigned long *user_mask_ptr)
>
> which set and get the cpu affinity (task->cpus_allowed) for a task,
> using the set_cpus_allowed function in Ingo's scheduler. The functions
> properly support changes to cpus_allowed, implement security, and are
> well-tested.
Setting the affinity of a whole process group also makes sense IMHO.
Therefore I think an interface more like the setpriority syscall
for sched_set_affinity (with two parameters which/who instead of a
single PID) would be more flexible, e.g.
int sched_set_affinity(int which, int who, unsigned int len,
unsigned long *new_mask_ptr);
with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
to the value of who.
Getting the mask of a group of processes doesn't make sense though
(what if they differ?), so the current interface of sched_get_affinity
is just fine IMHO.
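For readers unfamiliar with the convention being borrowed: getpriority(2) and setpriority(2) take a namespace selector (PRIO_PROCESS, PRIO_PGRP or PRIO_USER) followed by an ID in that namespace, with 0 meaning the caller. The sketch below only exercises that existing interface; the wrapper name nice_of is invented for illustration, and nothing here touches CPU affinity.

```c
#include <errno.h>
#include <sys/resource.h>   /* getpriority, PRIO_PROCESS, PRIO_PGRP, PRIO_USER */
#include <sys/types.h>      /* id_t */

/*
 * Hypothetical wrapper around getpriority(2).
 * `which` selects the ID namespace, `who` is an ID in that namespace
 * (0 means "the calling process / its group / its user").
 * Stores the nice value in *out and returns 0, or returns -1 with
 * errno set on failure.
 */
static int nice_of(int which, id_t who, int *out)
{
    int val;

    /* -1 is a legitimate nice value, so errno must be cleared and checked. */
    errno = 0;
    val = getpriority(which, who);
    if (val == -1 && errno != 0)
        return -1;

    *out = val;
    return 0;
}
```

A call such as nice_of(PRIO_PGRP, 0, &n) reports a single value for the whole group, which is only possible because the kernel picks one; the same ambiguity is why a group-wide "get" is harder to define for affinity masks.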
Andreas
--
Andreas Ferber - dev/consulting GmbH - Bielefeld, FRG
---------------------------------------------------------
+49 521 1365800 - af@devcon.net - www.devcon.net
* Re: [PATCH] syscall interface for cpu affinity
2002-03-11 0:38 ` Andreas Ferber
@ 2002-03-15 22:06 ` Stephen Samuel
2002-03-16 0:43 ` Andreas Ferber
0 siblings, 1 reply; 16+ messages in thread
From: Stephen Samuel @ 2002-03-15 22:06 UTC (permalink / raw)
To: Andreas Ferber; +Cc: Robert Love, torvalds, linux-kernel
Picking nits, but....
Andreas Ferber wrote:
> Setting the affinity of a whole process group also makes sense IMHO.
> Therefore I think an interface more like the setpriority syscall
> for sched_set_affinity (with two parameters which/who instead of a
> single PID) would be more flexible, e.g.
>
> int sched_set_affinity(int which, int who, unsigned int len,
> unsigned long *new_mask_ptr);
>
> with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
> to the value of who.
I would suggest that the order be
int sched_set_affinity(int who, int which, unsigned int len,
unsigned long *new_mask_ptr);
This would have the {p,pg}id be the first thing that a programmer
would see (likely more important than the 'which').
--
Stephen Samuel +1(604)876-0426 samuel@bcgreen.com
http://www.bcgreen.com/~samuel/
Powerful committed communication, reaching through fear, uncertainty and
doubt to touch the jewel within each person and bring it to life.
* Re: [PATCH] syscall interface for cpu affinity
2002-03-15 22:06 ` Stephen Samuel
@ 2002-03-16 0:43 ` Andreas Ferber
2002-03-16 4:24 ` Stephen Samuel
0 siblings, 1 reply; 16+ messages in thread
From: Andreas Ferber @ 2002-03-16 0:43 UTC (permalink / raw)
To: Stephen Samuel; +Cc: Robert Love, torvalds, linux-kernel
On Fri, Mar 15, 2002 at 02:06:04PM -0800, Stephen Samuel wrote:
> >
> > int sched_set_affinity(int which, int who, unsigned int len,
> > unsigned long *new_mask_ptr);
> >
> > with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
> > to the value of who.
Uh, who/which should be just the other way round in the description
(but not in the prototype). Sorry.
> I would suggest that the order be
>
> int sched_set_affinity(int who, int which, unsigned int len,
> unsigned long *new_mask_ptr);
>
> This would have the {p,pg}id be the first thing that a programmer
> would see (likely more important than the 'which').
See my correction above, does that address your concern?
Andreas
--
Andreas Ferber - dev/consulting GmbH - Bielefeld, FRG
---------------------------------------------------------
+49 521 1365800 - af@devcon.net - www.devcon.net
* Re: [PATCH] syscall interface for cpu affinity
2002-03-16 0:43 ` Andreas Ferber
@ 2002-03-16 4:24 ` Stephen Samuel
0 siblings, 0 replies; 16+ messages in thread
From: Stephen Samuel @ 2002-03-16 4:24 UTC (permalink / raw)
To: Andreas Ferber; +Cc: Robert Love, torvalds, linux-kernel
Almost... Same effect (mostly)...
It does, however, leave us arguing the linguistic semantics of
which name 'who' should have. It seems to me that the most
natural would be with 'who' being the 'name' of the target, and
'which' specifying which name space 'who' is operating in.
UGH: messing with these names via pronouns is too confusing:
-----------
How about this:
int sched_set_affinity(int who, int which, unsigned int len,
unsigned long *new_mask_ptr);
'who' being a {process, process-group or user} ID, and
with 'which' being one of {PRIO_PROCESS, PRIO_PGRP, PRIO_USER},
respectively -- specifying which namespace 'who' operates in.
I think that that is what you were trying to say, right?
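To make the who-first reading concrete, here is a minimal sketch of the argument checks such a call might perform. It makes no system call and is not taken from any patch in this thread: the function name check_affinity_args is hypothetical, and the rules it applies (who == 0 is accepted as "the caller", len is a byte count in whole unsigned longs, an all-zero mask is rejected) are assumptions chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>   /* PRIO_PROCESS, PRIO_PGRP, PRIO_USER */

/*
 * Hypothetical validation for a who-first affinity call.
 * 'who'   is a process, process-group or user ID;
 * 'which' names the namespace that 'who' is looked up in.
 * Returns 0 if the arguments look usable, -1 with errno = EINVAL if not.
 */
static int check_affinity_args(int who, int which, unsigned int len,
                               const unsigned long *mask)
{
    unsigned int i, words;

    if (which != PRIO_PROCESS && which != PRIO_PGRP && which != PRIO_USER)
        goto invalid;
    if (who < 0)
        goto invalid;
    if (mask == NULL || len == 0 || len % sizeof(unsigned long) != 0)
        goto invalid;

    /* Assumed rule: at least one CPU must remain allowed. */
    words = len / sizeof(unsigned long);
    for (i = 0; i < words; i++)
        if (mask[i] != 0)
            return 0;

invalid:
    errno = EINVAL;
    return -1;
}
```

Whichever argument comes first, the checks are the same; the ordering question is purely about what a programmer reads first at the call site.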
Andreas Ferber wrote:
> On Fri, Mar 15, 2002 at 02:06:04PM -0800, Stephen Samuel wrote:
>
>> >
>> > int sched_set_affinity(int which, int who, unsigned int len,
>> > unsigned long *new_mask_ptr);
>> >
>> > with who one of {PRIO_PROCESS,PRIO_PGRP,PRIO_USER} and which according
>> > to the value of who.
>>
>
> Uh, who/which should be just the other way round in the description
> (but not in the prototype). Sorry.
>
>
>>I would suggest that the order be
>>
>>int sched_set_affinity(int who, int which, unsigned int len,
>> unsigned long *new_mask_ptr);
>>
>>This would have the {p,pg}id be the first thing that a programmer
>>would see (likely more important than the 'which').
--
Stephen Samuel +1(604)876-0426 samuel@bcgreen.com
http://www.bcgreen.com/~samuel/
Powerful committed communication, reaching through fear, uncertainty and
doubt to touch the jewel within each person and bring it to life.
Thread overview: 16+ messages
2002-03-10 18:15 [PATCH] syscall interface for cpu affinity Robert Love
2002-03-10 20:29 ` Andreas Jaeger
2002-03-10 20:53 ` Robert Love
2002-03-10 21:03 ` Andreas Jaeger
2002-03-10 22:23 ` Andreas Schwab
2002-03-10 23:56 ` Andreas Ferber
2002-03-10 23:45 ` Jeff Garzik
1976-03-03 15:58 ` Tim Hockin
2002-03-11 0:08 ` Jeff Garzik
2002-03-11 0:32 ` Tim Hockin
2002-03-10 22:05 ` Chris Wedgwood
2002-03-10 22:11 ` Robert Love
2002-03-11 0:38 ` Andreas Ferber
2002-03-15 22:06 ` Stephen Samuel
2002-03-16 0:43 ` Andreas Ferber
2002-03-16 4:24 ` Stephen Samuel