* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
[not found] ` <51EDB378.8070301-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-06-26 14:28 ` Michael Kerrisk (man-pages)
[not found] ` <558D6171.1060901-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-06-26 14:28 UTC (permalink / raw)
To: Carlos O'Donell, Roland McGrath
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, KOSAKI Motohiro, libc-alpha
Carlos,
On 07/23/2013 12:34 AM, Carlos O'Donell wrote:
> On 07/22/2013 05:43 PM, Roland McGrath wrote:
>>> I can fix the glibc manual. A 'configured' CPU is one that the OS
>>> can bring online.
>>
>> Where do you get this definition, in the absence of a standard that
>> specifies _SC_NPROCESSORS_CONF? The only definition I've ever known for
>> _SC_NPROCESSORS_CONF is a value that's constant for at least the life of
>> the process (and probably until reboot) that is the upper bound for what
>> _SC_NPROCESSORS_ONLN might ever report. If the implementation for Linux is
>> inconsistent with that definition, then it's just a bug in the implementation.
>
> Let me reiterate my understanding such that you can help me clarify
> exactly my interpretation of the glibc manual wording regarding the
> two existing constants.
>
> The reality of the situation is that the linux kernel as an abstraction
> presents the following:
>
> (a) The number of online cpus.
> - Changes dynamically.
> - Not constant for the life of the process, but pretty constant.
>
> (b) The number of configured cpus.
> - The number of detected cpus that the OS could access.
> - Some of them may be offline for various reasons.
> - Changes dynamically with hotplug.
>
> (c) The number of possible CPUs the OS or hardware can support.
> - The internal software infrastructure is designed to support at
> most this many cpus.
> - Constant for the uptime of the system.
> - May be tied in some way to the hardware.
>
> On Linux, glibc currently maps _SC_NPROCESSORS_CONF to (b) via
> /sys/devices/system/cpu/cpu*, and _SC_NPROCESSORS_ONLN to (a) via
> /sys/devices/system/cpu/online.
>
> The problem is that sched_getaffinity and sched_setaffinity only care
> about (c), since the size of the kernel affinity mask is of size (c).
>
> What Motohiro-san was requesting was that the manual should make it clear
> that _SC_NPROCESSORS_CONF is distinct from (c) which is an OS limit that
> the user doesn't know.
>
> We need not expose (c) as a new _SC_* constant since it's not really
> required, since glibc's sched_getaffinity and sched_setaffinity could
> hide the fact that (c) exists from userspace (and that's what I suggest
> should happen).
>
> Does that clarify my statement?
It's a long time since the last activity in this discussion, and I see that
https://sourceware.org/bugzilla/show_bug.cgi?id=15630
remains open. I propose to apply the patch below to the
sched_setaffinity/sched_getaffinity man page. Seem okay?
Cheers,
Michael
--- a/man2/sched_setaffinity.2
+++ b/man2/sched_setaffinity.2
@@ -333,6 +334,57 @@ main(int argc, char *argv[])
}
}
.fi
+.SH BUGS
+The glibc
+.BR sched_setaffinity ()
+and
+.BR sched_getaffinity ()
+wrapper functions do not handle systems with more than 1024 CPUs.
+.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
+.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
+The
+.I cpu_set_t
+data type used by glibc has a fixed size of 128 bytes,
+meaning that the maximum CPU number that can be represented is 1023.
+If the system has more than 1024 CPUs, then:
+.IP * 3
+The
+.BR sched_setaffinity ()
+.I mask
+argument is not capable of representing the excess CPUs.
+.IP *
+Calls of the form:
+
+ sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
+
+will fail with error
+.BR EINVAL ,
+the error produced by the underlying system call for the case where the
+.I mask
+size specified in
+.I cpusetsize
+is smaller than the size of the affinity mask used by the kernel.
+.PP
+The workaround for this problem is to fall back to the use of the
+underlying system call (via
+.BR syscall (2)),
+passing
+.I mask
+arguments of a sufficient size.
+Using a value based on the number of online CPUs:
+
+ (sysconf(_SC_NPROCESSORS_CONF) / (sizeof(unsigned long) * 8) + 1)
+ * sizeof(unsigned long)
+
+is probably sufficient as the size of the mask,
+although the value returned by the
+.BR sysconf ()
+call can in theory change during the lifetime of the process.
+Alternatively, one can probe for the size of the required mask using raw
+.BR sched_getaffinity ()
+system calls with increasing mask sizes
+until the call does not fail with the error
+.BR EINVAL .
.SH SEE ALSO
.ad l
.nh
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
[not found] ` <558D6171.1060901-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-06-26 20:05 ` Michael Kerrisk (man-pages)
[not found] ` <558DB0A0.2040707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-07-01 12:37 ` Florian Weimer
0 siblings, 2 replies; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-06-26 20:05 UTC (permalink / raw)
To: Carlos O'Donell, Roland McGrath
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, KOSAKI Motohiro, libc-alpha,
linux-man-u79uwXL29TY76Z2rM5mHXA
Sigh.... I forgot much of what I learned as I wrote the CPU_SET(3)
page many years ago. Revised patch below.
On 06/26/2015 04:28 PM, Michael Kerrisk (man-pages) wrote:
> Carlos,
>
> On 07/23/2013 12:34 AM, Carlos O'Donell wrote:
>> On 07/22/2013 05:43 PM, Roland McGrath wrote:
>>>> I can fix the glibc manual. A 'configured' CPU is one that the OS
>>>> can bring online.
>>>
>>> Where do you get this definition, in the absence of a standard that
>>> specifies _SC_NPROCESSORS_CONF? The only definition I've ever known for
>>> _SC_NPROCESSORS_CONF is a value that's constant for at least the life of
>>> the process (and probably until reboot) that is the upper bound for what
>>> _SC_NPROCESSORS_ONLN might ever report. If the implementation for Linux is
>>> inconsistent with that definition, then it's just a bug in the implementation.
>>
>> Let me reiterate my understanding such that you can help me clarify
>> exactly my interpretation of the glibc manual wording regarding the
>> two existing constants.
>>
>> The reality of the situation is that the linux kernel as an abstraction
>> presents the following:
>>
>> (a) The number of online cpus.
>> - Changes dynamically.
>> - Not constant for the life of the process, but pretty constant.
>>
>> (b) The number of configured cpus.
>> - The number of detected cpus that the OS could access.
>> - Some of them may be offline for various reasons.
>> - Changes dynamically with hotplug.
>>
>> (c) The number of possible CPUs the OS or hardware can support.
>> - The internal software infrastructure is designed to support at
>> most this many cpus.
>> - Constant for the uptime of the system.
>> - May be tied in some way to the hardware.
>>
>> On Linux, glibc currently maps _SC_NPROCESSORS_CONF to (b) via
>> /sys/devices/system/cpu/cpu*, and _SC_NPROCESSORS_ONLN to (a) via
>> /sys/devices/system/cpu/online.
>>
>> The problem is that sched_getaffinity and sched_setaffinity only care
>> about (c), since the size of the kernel affinity mask is of size (c).
>>
>> What Motohiro-san was requesting was that the manual should make it clear
>> that _SC_NPROCESSORS_CONF is distinct from (c) which is an OS limit that
>> the user doesn't know.
>>
>> We need not expose (c) as a new _SC_* constant since it's not really
>> required, since glibc's sched_getaffinity and sched_setaffinity could
>> hide the fact that (c) exists from userspace (and that's what I suggest
>> should happen).
>>
>> Does that clarify my statement?
>
> It's a long time since the last activity in this discussion, and I see that
> https://sourceware.org/bugzilla/show_bug.cgi?id=15630
> remains open. I propose to apply the patch below to the
> sched_setaffinity/sched_getaffinity man page. Seem okay?
>
> Cheers,
>
> Michael
>
>
> --- a/man2/sched_setaffinity.2
> +++ b/man2/sched_setaffinity.2
> @@ -333,6 +334,57 @@ main(int argc, char *argv[])
> }
> }
> .fi
> +.SH BUGS
> +The glibc
> +.BR sched_setaffinity ()
> +and
> +.BR sched_getaffinity ()
> +wrapper functions do not handle systems with more than 1024 CPUs.
> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
> +The
> +.I cpu_set_t
> +data type used by glibc has a fixed size of 128 bytes,
> +meaning that the maximum CPU number that can be represented is 1023.
> +If the system has more than 1024 CPUs, then:
> +.IP * 3
> +The
> +.BR sched_setaffinity ()
> +.I mask
> +argument is not capable of representing the excess CPUs.
> +.IP *
> +Calls of the form:
> +
> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
> +
> +will fail with error
> +.BR EINVAL ,
> +the error produced by the underlying system call for the case where the
> +.I mask
> +size specified in
> +.I cpusetsize
> +is smaller than the size of the affinity mask used by the kernel.
> +.PP
> +The workaround for this problem is to fall back to the use of the
> +underlying system call (via
> +.BR syscall (2)),
> +passing
> +.I mask
> +arguments of a sufficient size.
> +Using a value based on the number of online CPUs:
> +
> + (sysconf(_SC_NPROCESSORS_CONF) / (sizeof(unsigned long) * 8) + 1)
> + * sizeof(unsigned long)
> +
> +is probably sufficient as the size of the mask,
> +although the value returned by the
> +.BR sysconf ()
> +call can in theory change during the lifetime of the process.
> +Alternatively, one can probe for the size of the required mask using raw
> +.BR sched_getaffinity ()
> +system calls with increasing mask sizes
> +until the call does not fail with the error
> +.BR EINVAL .
> .SH SEE ALSO
> .ad l
> .nh
Okay -- scratch the above. How about the patch below?
Cheers,
Michael
--- a/man2/sched_setaffinity.2
+++ b/man2/sched_setaffinity.2
@@ -223,6 +223,47 @@ system call returns the size (in bytes) of the
.I cpumask_t
data type that is used internally by the kernel to
represent the CPU set bit mask.
+.SS Handling systems with more than 1024 CPUs
+The
+.I cpu_set_t
+data type used by glibc has a fixed size of 128 bytes,
+meaning that the maximum CPU number that can be represented is 1023.
+.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
+.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
+If the system has more than 1024 CPUs, then calls of the form:
+
+ sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
+
+will fail with the error
+.BR EINVAL ,
+the error produced by the underlying system call for the case where the
+.I mask
+size specified in
+.I cpusetsize
+is smaller than the size of the affinity mask used by the kernel.
+.PP
+The underlying system calls (which represent CPU masks as bit masks of type
+.IR "unsigned long\ *" )
+impose no restriction on the size of the mask.
+To handle systems with more than 1024 CPUs, one must dynamically allocate the
+.I mask
+argument using
+.BR CPU_ALLOC (3)
+and manipulate the mask using the "_S" macros described in
+.BR CPU_ALLOC (3).
+Using an allocation based on the number of online CPUs:
+
+ cpu_set_t *mask = CPU_ALLOC(CPU_ALLOC_SIZE(
+ sysconf(_SC_NPROCESSORS_CONF)));
+
+is probably sufficient, although the value returned by the
+.BR sysconf ()
+call can in theory change during the lifetime of the process.
+Alternatively, one can obtain a value that is guaranteed to be stable for
+the lifetime of the process by proby for the size of the required mask using
+.BR sched_getaffinity ()
+calls with increasing mask sizes until the call does not fail with the error
+.BR EINVAL .
.SH EXAMPLE
The program below creates a child process.
The parent and child then each assign themselves to a specified CPU
* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
[not found] ` <558DB0A0.2040707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-06-29 21:40 ` Tolga Dalman
[not found] ` <5591BB55.5080605-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Tolga Dalman @ 2015-06-29 21:40 UTC (permalink / raw)
To: Michael Kerrisk (man-pages), Carlos O'Donell, Roland McGrath
Cc: KOSAKI Motohiro, libc-alpha, linux-man-u79uwXL29TY76Z2rM5mHXA
Michael,
given that the approach is accepted by Carlos and Roland, I have
some minor textual suggestions for the patch itself.
On 06/26/2015 10:05 PM, Michael Kerrisk (man-pages) wrote:
> --- a/man2/sched_setaffinity.2
> +++ b/man2/sched_setaffinity.2
> @@ -223,6 +223,47 @@ system call returns the size (in bytes) of the
> .I cpumask_t
> data type that is used internally by the kernel to
> represent the CPU set bit mask.
> +.SS Handling systems with more than 1024 CPUs
What if the system has exactly 1024 CPUs?
Suggestion: systems with 1024 or more CPUs
> +The
> +.I cpu_set_t
> +data type used by glibc has a fixed size of 128 bytes,
> +meaning that the maximum CPU number that can be represented is 1023.
> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
No objection, although I have never really noticed external references
in man-pages (esp. web refs). Shouldn't these be generally avoided?
(and yes, I have noticed the FIXME)
> +If the system has more than 1024 CPUs, then calls of the form:
1024 or more CPUs.
> +
> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
> +
> +will fail with the error
> +.BR EINVAL ,
> +the error produced by the underlying system call for the case where the
> +.I mask
> +size specified in
> +.I cpusetsize
> +is smaller than the size of the affinity mask used by the kernel.
> +.PP
> +The underlying system calls (which represent CPU masks as bit masks of type
> +.IR "unsigned long\ *" )
> +impose no restriction on the size of the mask.
> +To handle systems with more than 1024 CPUs, one must dynamically allocate the
> +.I mask
> +argument using
> +.BR CPU_ALLOC (3)
I would rewrite the sentence to avoid "one must".
> +and manipulate the mask using the "_S" macros described in
and manipulate the macros ending with "_S" as described in
> +.BR CPU_ALLOC (3).
> +Using an allocation based on the number of online CPUs:
> +
> + cpu_set_t *mask = CPU_ALLOC(CPU_ALLOC_SIZE(
> + sysconf(_SC_NPROCESSORS_CONF)));
> +
> +is probably sufficient, although the value returned by the
> +.BR sysconf ()
> +call can in theory change during the lifetime of the process.
> +Alternatively, one can obtain a value that is guaranteed to be stable for
Like above, I would replace "one can obtain a value" by "a value can be obtained".
> +the lifetime of the process by proby for the size of the required mask using
s/proby/probing/.
> +.BR sched_getaffinity ()
> +calls with increasing mask sizes until the call does not fail with the error
> +.BR EINVAL .
I would replace "until the call does not fail with error ..." by "while the call succeeds".
Also, the sentence is too long, IMHO.
Best regards
Tolga Dalman
* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
2015-06-26 20:05 ` Michael Kerrisk (man-pages)
[not found] ` <558DB0A0.2040707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-07-01 12:37 ` Florian Weimer
[not found] ` <5593DF14.2060804-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
1 sibling, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2015-07-01 12:37 UTC (permalink / raw)
To: Michael Kerrisk (man-pages), Carlos O'Donell, Roland McGrath
Cc: KOSAKI Motohiro, libc-alpha, linux-man
On 06/26/2015 10:05 PM, Michael Kerrisk (man-pages) wrote:
> +.SS Handling systems with more than 1024 CPUs
> +The
> +.I cpu_set_t
> +data type used by glibc has a fixed size of 128 bytes,
> +meaning that the maximum CPU number that can be represented is 1023.
> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
> +If the system has more than 1024 CPUs, then calls of the form:
> +
> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
> +
> +will fail with the error
> +.BR EINVAL ,
> +the error produced by the underlying system call for the case where the
> +.I mask
> +size specified in
> +.I cpusetsize
> +is smaller than the size of the affinity mask used by the kernel.
I think it is best to leave this as unspecified as possible. Kernel
behavior already changed once, and I can imagine it changing again.
Carlos and I tried to get clarification of the future direction of the
kernel interface here:
<https://sourceware.org/ml/libc-alpha/2015-06/msg00210.html>
No reply so far, unless I missed something.
> +.PP
> +The underlying system calls (which represent CPU masks as bit masks of type
> +.IR "unsigned long\ *" )
> +impose no restriction on the size of the mask.
> +To handle systems with more than 1024 CPUs, one must dynamically allocate the
> +.I mask
> +argument using
> +.BR CPU_ALLOC (3)
> +and manipulate the mask using the "_S" macros described in
> +.BR CPU_ALLOC (3).
> +Using an allocation based on the number of online CPUs:
> +
> + cpu_set_t *mask = CPU_ALLOC(CPU_ALLOC_SIZE(
> + sysconf(_SC_NPROCESSORS_CONF)));
I believe this is incorrect in several ways:
CPU_ALLOC uses the raw CPU counts. CPU_ALLOC_SIZE converts from the raw
count to the size in bytes. (This API is misdesigned.)
sysconf(_SC_NPROCESSORS_CONF) is not related to the kernel CPU mask
size, so it is not the correct value.
> +is probably sufficient, although the value returned by the
> +.BR sysconf ()
> +call can in theory change during the lifetime of the process.
> +Alternatively, one can obtain a value that is guaranteed to be stable for
> +the lifetime of the process by proby for the size of the required mask using
> +.BR sched_getaffinity ()
> +calls with increasing mask sizes until the call does not fail with the error
This is the only possible way right now if you do not want to read
sysconf values.
It's also worth noting that the system call and the glibc function have
different return values.
--
Florian Weimer / Red Hat Product Security
* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
[not found] ` <5593DF14.2060804-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-07-21 15:03 ` Michael Kerrisk (man-pages)
[not found] ` <55AE5F33.3080105-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-07-21 15:03 UTC (permalink / raw)
To: Florian Weimer, Carlos O'Donell, Roland McGrath
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, KOSAKI Motohiro, libc-alpha,
linux-man-u79uwXL29TY76Z2rM5mHXA
Hello Florian,
Thanks for your comments, and sorry for the delayed follow-up.
On 07/01/2015 02:37 PM, Florian Weimer wrote:
> On 06/26/2015 10:05 PM, Michael Kerrisk (man-pages) wrote:
>
>> +.SS Handling systems with more than 1024 CPUs
>> +The
>> +.I cpu_set_t
>> +data type used by glibc has a fixed size of 128 bytes,
>> +meaning that the maximum CPU number that can be represented is 1023.
>> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
>> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
>> +If the system has more than 1024 CPUs, then calls of the form:
>> +
>> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>> +
>> +will fail with the error
>> +.BR EINVAL ,
>> +the error produced by the underlying system call for the case where the
>> +.I mask
>> +size specified in
>> +.I cpusetsize
>> +is smaller than the size of the affinity mask used by the kernel.
>
> I think it is best to leave this as unspecified as possible. Kernel
> behavior already changed once, and I can imagine it changing again.
Hmmm. Something needs to be said about what the kernel is doing though.
Otherwise, it's hard to make sense of this subsection. Did you have a
suggested rewording that removes the piece you find problematic?
> Carlos and I tried to get clarification of the future direction of the
> kernel interface here:
>
> <https://sourceware.org/ml/libc-alpha/2015-06/msg00210.html>
>
> No reply so far, unless I missed something.
Okay
>> +.PP
>> +The underlying system calls (which represent CPU masks as bit masks of type
>> +.IR "unsigned long\ *" )
>> +impose no restriction on the size of the mask.
>> +To handle systems with more than 1024 CPUs, one must dynamically allocate the
>> +.I mask
>> +argument using
>> +.BR CPU_ALLOC (3)
>> +and manipulate the mask using the "_S" macros described in
>> +.BR CPU_ALLOC (3).
>> +Using an allocation based on the number of online CPUs:
>> +
>> + cpu_set_t *mask = CPU_ALLOC(CPU_ALLOC_SIZE(
>> + sysconf(_SC_NPROCESSORS_CONF)));
>
> I believe this is incorrect in several ways:
>
> CPU_ALLOC uses the raw CPU counts. CPU_ALLOC_SIZE converts from the raw
> count to the size in bytes. (This API is misdesigned.)
D'oh! Yes, the use of CPU_ALLOC_SIZE() was clearly misguided.
> sysconf(_SC_NPROCESSORS_CONF) is not related to the kernel CPU mask
> size, so it is not the correct value.
Yes, I understand now.
>> +is probably sufficient, although the value returned by the
>> +.BR sysconf ()
>> +call can in theory change during the lifetime of the process.
>> +Alternatively, one can obtain a value that is guaranteed to be stable for
>> +the lifetime of the process by proby for the size of the required mask using
>> +.BR sched_getaffinity ()
>> +calls with increasing mask sizes until the call does not fail with the error
>
> This is the only possible way right now if you do not want to read
> sysconf values.
Okay. I've amended the text to remove the first piece.
> It's also worth noting that the system call and the glibc function have
> different return values.
Yes, I already cover that elsewhere in the page. See the quoted text below.
Okay, so now I have:
C library/kernel differences
This manual page describes the glibc interface for the CPU
affinity calls. The actual system call interface is slightly
different, with the mask being typed as unsigned long *,
reflecting the fact that the underlying implementation of CPU
sets is a simple bit mask. On success, the raw sched_getaffinity()
system call returns the size (in bytes) of the cpumask_t
data type that is used internally by the kernel to represent
the CPU set bit mask.
Handling systems with more than 1024 CPUs
The underlying system calls (which represent CPU masks as bit
masks of type unsigned long *) impose no restriction on the
size of the CPU mask. However, the cpu_set_t data type used by
glibc has a fixed size of 128 bytes, meaning that the maximum
CPU number that can be represented is 1023. If the system has
more than 1024 CPUs, then calls of the form:
sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
will fail with the error EINVAL, the error produced by the
underlying system call for the case where the mask size
specified in cpusetsize is smaller than the size of the affinity
mask used by the kernel.
When working on systems with more than 1024 CPUs, one must
dynamically allocate the mask argument. Currently, the only
way to do this is by probing for the size of the required mask
using sched_getaffinity() calls with increasing mask sizes
(until the call does not fail with the error EINVAL).
Better?
Cheers,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
[not found] ` <5591BB55.5080605-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-21 15:03 ` Michael Kerrisk (man-pages)
0 siblings, 0 replies; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-07-21 15:03 UTC (permalink / raw)
To: Tolga Dalman, Carlos O'Donell, Roland McGrath
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, KOSAKI Motohiro, libc-alpha,
linux-man-u79uwXL29TY76Z2rM5mHXA
Hello Tolga,
On 06/29/2015 11:40 PM, Tolga Dalman wrote:
> Michael,
>
> given the approach is accepted by Carlos and Roland, I have
> some minor textual suggestions for the patch itself.
>
> On 06/26/2015 10:05 PM, Michael Kerrisk (man-pages) wrote:
>> --- a/man2/sched_setaffinity.2
>> +++ b/man2/sched_setaffinity.2
>> @@ -223,6 +223,47 @@ system call returns the size (in bytes) of the
>> .I cpumask_t
>> data type that is used internally by the kernel to
>> represent the CPU set bit mask.
>> +.SS Handling systems with more than 1024 CPUs
>
> What if the system has exactly 1024 CPUs ?
> Suggestion: systems with 1024 or more CPUs
I think you've missed something here. CPUs are numbered starting at 0.
"more than 1024 CPUs" is correct here, I belive.
>
>> +The
>> +.I cpu_set_t
>> +data type used by glibc has a fixed size of 128 bytes,
>> +meaning that the maximum CPU number that can be represented is 1023.
>> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
>> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
>
> No objection, although I have never really noticed external references
> in man-pages (esp. web refs). Shouldn't these be generally avoided ?
> (and yes, I have noticed the FIXME)
Those pieces are comments in the page source (not rendered by man(1)).
>> +If the system has more than 1024 CPUs, then calls of the form:
>
> 1024 or more CPUs.
See above
>> +
>> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>> +
>> +will fail with the error
>> +.BR EINVAL ,
>> +the error produced by the underlying system call for the case where the
>> +.I mask
>> +size specified in
>> +.I cpusetsize
>> +is smaller than the size of the affinity mask used by the kernel.
>> +.PP
>> +The underlying system calls (which represent CPU masks as bit masks of type
>> +.IR "unsigned long\ *" )
>> +impose no restriction on the size of the mask.
>> +To handle systems with more than 1024 CPUs, one must dynamically allocate the
>> +.I mask
>> +argument using
>> +.BR CPU_ALLOC (3)
>
> I would rewrite the sentence to avoid "one must".
This is a "voice" thing. I personally find "one must" is okay.
>> +and manipulate the mask using the "_S" macros described in
>
> and manipulate the macros ending with "_S" as described in
I think you've misread the text. I think it's okay.
>> +.BR CPU_ALLOC (3).
>> +Using an allocation based on the number of online CPUs:
>> +
>> + cpu_set_t *mask = CPU_ALLOC(CPU_ALLOC_SIZE(
>> + sysconf(_SC_NPROCESSORS_CONF)));
>> +
>> +is probably sufficient, although the value returned by the
>> +.BR sysconf ()
>> +call can in theory change during the lifetime of the process.
>> +Alternatively, one can obtain a value that is guaranteed to be stable for
>
> Like above, I would replace "one can obtain a value" by "a value can be obtained".
See above.
>> +the lifetime of the process by proby for the size of the required mask using
>
> s/proby/probing/.
Thanks--I'd already spotted that one and fixed.
>> +.BR sched_getaffinity ()
>> +calls with increasing mask sizes until the call does not fail with the error
>> +.BR EINVAL .
>
> I would replace "until the call does not fail with error ..." by "while the call succeeds".
I think you've misunderstood the logic here... Take another look at the sentence.
Thanks,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
[not found] ` <55AE5F33.3080105-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-07-22 16:02 ` Florian Weimer
[not found] ` <55AFBE87.1040006-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2015-07-22 16:02 UTC (permalink / raw)
To: Michael Kerrisk (man-pages), Carlos O'Donell, Roland McGrath
Cc: KOSAKI Motohiro, libc-alpha, linux-man-u79uwXL29TY76Z2rM5mHXA
On 07/21/2015 05:03 PM, Michael Kerrisk (man-pages) wrote:
> Hello Florian,
>
> Thanks for your comments, and sorry for the delayed follow-up.
>
> On 07/01/2015 02:37 PM, Florian Weimer wrote:
>> On 06/26/2015 10:05 PM, Michael Kerrisk (man-pages) wrote:
>>
>>> +.SS Handling systems with more than 1024 CPUs
>>> +The
>>> +.I cpu_set_t
>>> +data type used by glibc has a fixed size of 128 bytes,
>>> +meaning that the maximum CPU number that can be represented is 1023.
>>> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
>>> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
>>> +If the system has more than 1024 CPUs, then calls of the form:
>>> +
>>> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>>> +
>>> +will fail with the error
>>> +.BR EINVAL ,
>>> +the error produced by the underlying system call for the case where the
>>> +.I mask
>>> +size specified in
>>> +.I cpusetsize
>>> +is smaller than the size of the affinity mask used by the kernel.
>>
>> I think it is best to leave this as unspecified as possible. Kernel
>> behavior already changed once, and I can imagine it changing again.
>
> Hmmm. Something needs to be said about what the kernel is doing though.
> Otherwise, it's hard to make sense of this subsection. Did you have a
> suggested rewording that removes the piece you find problematic?
What about this?
“If the kernel affinity mask is larger than 1024 then
…
is smaller than the size of the affinity mask used by the kernel.
Depending on the system CPU topology, the kernel affinity mask can
be substantially larger than the number of active CPUs in the system.
”
I.e., make clear that the size of the mask can be quite different from
the CPU count.
> Handling systems with more than 1024 CPUs
> The underlying system calls (which represent CPU masks as bit
> masks of type unsigned long *) impose no restriction on the
> size of the CPU mask. However, the cpu_set_t data type used by
> glibc has a fixed size of 128 bytes, meaning that the maximum
> CPU number that can be represented is 1023. If the system has
> more than 1024 CPUs, then calls of the form:
>
> sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>
> will fail with the error EINVAL, the error produced by the
> underlying system call for the case where the mask size speci‐
> fied in cpusetsize is smaller than the size of the affinity
> mask used by the kernel.
>
> When working on systems with more than 1024 CPUs, one must
> dynamically allocate the mask argument. Currently, the only
> way to do this is by probing for the size of the required mask
> using sched_getaffinity() calls with increasing mask sizes
> (until the call does not fail with the error EINVAL).
>
> Better?
“more than 1024 CPUs” should be “large [kernel CPU] affinity masks”
throughout.
--
Florian Weimer / Red Hat Product Security
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
[not found] ` <55AFBE87.1040006-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-07-22 16:43 ` Michael Kerrisk (man-pages)
0 siblings, 0 replies; 8+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-07-22 16:43 UTC (permalink / raw)
To: Florian Weimer
Cc: Carlos O'Donell, Roland McGrath, KOSAKI Motohiro, libc-alpha,
linux-man
Hello Florian,
On 22 July 2015 at 18:02, Florian Weimer <fweimer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 07/21/2015 05:03 PM, Michael Kerrisk (man-pages) wrote:
>> Hello Florian,
>>
>> Thanks for your comments, and sorry for the delayed follow-up.
>>
>> On 07/01/2015 02:37 PM, Florian Weimer wrote:
>>> On 06/26/2015 10:05 PM, Michael Kerrisk (man-pages) wrote:
>>>
>>>> +.SS Handling systems with more than 1024 CPUs
>>>> +The
>>>> +.I cpu_set_t
>>>> +data type used by glibc has a fixed size of 128 bytes,
>>>> +meaning that the maximum CPU number that can be represented is 1023.
>>>> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
>>>> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
>>>> +If the system has more than 1024 CPUs, then calls of the form:
>>>> +
>>>> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>>>> +
>>>> +will fail with the error
>>>> +.BR EINVAL ,
>>>> +the error produced by the underlying system call for the case where the
>>>> +.I mask
>>>> +size specified in
>>>> +.I cpusetsize
>>>> +is smaller than the size of the affinity mask used by the kernel.
>>>
>>> I think it is best to leave this as unspecified as possible. Kernel
>>> behavior already changed once, and I can imagine it changing again.
>>
>> Hmmm. Something needs to be said about what the kernel is doing though.
>> Otherwise, it's hard to make sense of this subsection. Did you have a
>> suggested rewording that removes the piece you find problematic?
>
> What about this?
>
> “If the kernel affinity mask is larger than 1024 then
> …
> is smaller than the size of the affinity mask used by the kernel.
> Depending on the system CPU topology, the kernel affinity mask can
> be substantially larger than the number of active CPUs in the system.
> ”
Looks good. I've taken that.
> I.e., make clear that the size of the mask can be quite different from
> the CPU count.
>
>> Handling systems with more than 1024 CPUs
>> The underlying system calls (which represent CPU masks as bit
>> masks of type unsigned long *) impose no restriction on the
>> size of the CPU mask. However, the cpu_set_t data type used by
>> glibc has a fixed size of 128 bytes, meaning that the maximum
>> CPU number that can be represented is 1023. If the system has
>> more than 1024 CPUs, then calls of the form:
>>
>> sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>>
>> will fail with the error EINVAL, the error produced by the
>> underlying system call for the case where the mask size speci‐
>> fied in cpusetsize is smaller than the size of the affinity
>> mask used by the kernel.
>>
>> When working on systems with more than 1024 CPUs, one must
>> dynamically allocate the mask argument. Currently, the only
>> way to do this is by probing for the size of the required mask
>> using sched_getaffinity() calls with increasing mask sizes
>> (until the call does not fail with the error EINVAL).
>>
>> Better?
>
> “more than 1024 CPUs” should be “large [kernel CPU] affinity masks”
> throughout.
Done.
Thanks for your further input. So now we have:
C library/kernel differences
This manual page describes the glibc interface for the CPU affin‐
ity calls. The actual system call interface is slightly differ‐
ent, with the mask being typed as unsigned long *, reflecting the
fact that the underlying implementation of CPU sets is a simple
bit mask. On success, the raw sched_getaffinity() system call
returns the size (in bytes) of the cpumask_t data type that is
used internally by the kernel to represent the CPU set bit mask.
Handling systems with large CPU affinity masks
The underlying system calls (which represent CPU masks as bit
masks of type unsigned long *) impose no restriction on the size
of the CPU mask. However, the cpu_set_t data type used by glibc
has a fixed size of 128 bytes, meaning that the maximum CPU num‐
ber that can be represented is 1023. If the kernel CPU affinity
mask is larger than 1024, then calls of the form:
sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
will fail with the error EINVAL, the error produced by the under‐
lying system call for the case where the mask size specified in
cpusetsize is smaller than the size of the affinity mask used by
the kernel. (Depending on the system CPU topology, the kernel
affinity mask can be substantially larger than the number of
active CPUs in the system.)
When working on systems with large kernel CPU affinity masks, one
must dynamically allocate the mask argument. Currently, the only
way to do this is by probing for the size of the required mask
using sched_getaffinity() calls with increasing mask sizes (until
the call does not fail with the error EINVAL).
Cheers,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/