* Fix to numa_node_to_cpus_v2
@ 2010-01-28 5:53 Sharyathi Nagesh
0 siblings, 0 replies; 4+ messages in thread
From: Sharyathi Nagesh @ 2010-01-28 5:53 UTC (permalink / raw)
To: linux-numa, Andi Kleen, Christoph Lameter, Cliff Wickman,
Lee Schermerhorn, Amit
[-- Attachment #1: Type: text/plain, Size: 2681 bytes --]
Hi
We observed that numa_node_to_cpus api() api converts a node number to a bitmask of
CPUs. The user must pass a long enough buffer. If the buffer is not long enough
errno will be set to ERANGE and -1 returned. On success 0 is returned.
This api has been changed in numa version 2.0. It has new implementation (_v2)
Analysis:
Now within the numa_node_to_cpus code there is a check if the size of buffer
passed from the user matches the one returned by the sched_getaffinity. This
check fails and hence we see "map size mismatch: abort" messages coming out on
console. My system has 4 node and 8 CPUs.
------------------------------------------------------------------------------------
Testcase to reproduce the problem
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <numa.h>
typedef unsigned long BUF[64];
int numa_exit_on_error = 0;
void node_to_cpus(void)
{
int i;
BUF cpubuf;
BUF affinityCPUs;
int maxnode = numa_max_node();
printf("available: %d nodes (0-%d)\n", 1+maxnode, maxnode);
for (i = 0; i <= maxnode; i++) {
printf("Calling numa_node_to_cpus()\n");
printf("Size of BUF is : %d \n",sizeof(BUF));
if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
printf("Calling numa_node_to_cpus() again \n");
if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
} else {
printf("Got < 0 \n");
numa_error("numa_node_to_cpu");
numa_exit_on_error = 1;
exit(numa_exit_on_error);
}
} else {
numa_error("numa_node_to_cpu 0");
numa_exit_on_error = 1;
exit(numa_exit_on_error);
}
}
}
int main()
{
void node_to_cpus();
if (numa_available() < 0)
{
printf("This system does not support NUMA policy\n");
numa_error("numa_available");
numa_exit_on_error = 1;
exit(numa_exit_on_error);
}
node_to_cpus();
return numa_exit_on_error;
}
------------------------------------------------------------------------------------
Problem Fix:
The fix is to allow numa_node_to_cpus_v2() to fail only when the supplied buffer is
smaller than the bitmask required to represent online NUMA nodes.
Attaching the patch to address this issues, patch is generated against numactl-2.0.4-rc1
Regards
Yeehaw
[-- Attachment #2: fix_numa_node_to_cpus_v2.patch --]
[-- Type: text/x-patch, Size: 530 bytes --]
Index: numactl-2.0.4-rc1/libnuma.c
===================================================================
--- numactl-2.0.4-rc1.orig/libnuma.c 2009-12-16 02:48:26.000000000 +0530
+++ numactl-2.0.4-rc1/libnuma.c 2010-01-27 17:06:30.000000000 +0530
@@ -1272,7 +1272,7 @@
if (node_cpu_mask_v2[node]) {
/* have already constructed a mask for this node */
- if (buffer->size != node_cpu_mask_v2[node]->size) {
+ if (buffer->size < node_cpu_mask_v2[node]->size) {
numa_error("map size mismatch; abort\n");
return -1;
}
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Fix to numa_node_to_cpus_v2
@ 2010-02-01 22:37 Cliff Wickman
2010-02-02 5:16 ` Sharyathi Nagesh
0 siblings, 1 reply; 4+ messages in thread
From: Cliff Wickman @ 2010-02-01 22:37 UTC (permalink / raw)
To: linux-numa; +Cc: sharyath
Hi Sharyathi,
Thanks for both patch and test case.
The patch needs one more change I think.
The target buffer may be bigger, so the copy of the map needs
to be zero-extended.
Would you review it?
Thx.
-Cliff
> Date: Thu, 28 Jan 2010 11:23:05 +0530
> From: Sharyathi Nagesh <sharyath@in.ibm.com>
> To: linux-numa@vger.kernel.org, Andi Kleen <andi@firstfloor.org>,
> Christoph Lameter <clameter@sgi.com>, Cliff Wickman <cpw@sgi.com>,
> Lee Schermerhorn <lee.schermerhorn@hp.com>,
> Amit K Arora <amitarora@in.ibm.com>, deepti.kalra@in.ibm.com
> Subject: Fix to numa_node_to_cpus_v2
>
> Hi
>
> We observed that numa_node_to_cpus api() api converts a node number to a
> bitmask of CPUs. The user must pass a long enough buffer. If the buffer is not
> long enough errno will be set to ERANGE and -1 returned. On success 0 is returned.
> This api has been changed in numa version 2.0. It has new implementation (_v2)
>
> Analysis:
> Now within the numa_node_to_cpus code there is a check if the size of buffer
> passed from the user matches the one returned by the sched_getaffinity. This
> check fails and hence we see "map size mismatch: abort" messages coming out on
> console. My system has 4 node and 8 CPUs.
>
> Testcase to reproduce the problem:
> #include <errno.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <numa.h>
>
> typedef unsigned long BUF[64];
>
> int numa_exit_on_error = 0;
>
> void node_to_cpus(void)
> {
> int i;
> BUF cpubuf;
> BUF affinityCPUs;
> int maxnode = numa_max_node();
> printf("available: %d nodes (0-%d)\n", 1+maxnode, maxnode);
> for (i = 0; i <= maxnode; i++) {
> printf("Calling numa_node_to_cpus()\n");
> printf("Size of BUF is : %d \n",sizeof(BUF));
> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
> printf("Calling numa_node_to_cpus() again \n");
> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
> } else {
> printf("Got < 0 \n");
> numa_error("numa_node_to_cpu");
> numa_exit_on_error = 1;
> exit(numa_exit_on_error);
> }
> } else {
> numa_error("numa_node_to_cpu 0");
> numa_exit_on_error = 1;
> exit(numa_exit_on_error);
> }
> }
> }
> int main()
> {
> void node_to_cpus();
> if (numa_available() < 0)
> {
> printf("This system does not support NUMA policy\n");
> numa_error("numa_available");
> numa_exit_on_error = 1;
> exit(numa_exit_on_error);
> }
> node_to_cpus();
> return numa_exit_on_error;
> }
> --------------------------------------------------------------------------------
>
> Problem Fix:
> The fix is to allow numa_node_to_cpus_v2() to fail only when the supplied
> buffer is smaller than the bitmask required to represent online NUMA nodes.
> Attaching the patch to address this issues, patch is generated against numactl-2.0.4-rc1
>
> Regards
> Yeehaw
>
---
libnuma.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: numactl-dev/libnuma.c
===================================================================
--- numactl-dev.orig/libnuma.c
+++ numactl-dev/libnuma.c
@@ -1272,11 +1272,11 @@ numa_node_to_cpus_v2(int node, struct bi
if (node_cpu_mask_v2[node]) {
/* have already constructed a mask for this node */
- if (buffer->size != node_cpu_mask_v2[node]->size) {
+ if (buffer->size < node_cpu_mask_v2[node]->size) {
numa_error("map size mismatch; abort\n");
return -1;
}
- memcpy(buffer->maskp, node_cpu_mask_v2[node]->maskp, bufferlen);
+ copy_bitmask_to_bitmask(node_cpu_mask_v2[node], buffer);
return 0;
}
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Fix to numa_node_to_cpus_v2
2010-02-01 22:37 Fix to numa_node_to_cpus_v2 Cliff Wickman
@ 2010-02-02 5:16 ` Sharyathi Nagesh
2010-02-02 13:40 ` Cliff Wickman
0 siblings, 1 reply; 4+ messages in thread
From: Sharyathi Nagesh @ 2010-02-02 5:16 UTC (permalink / raw)
To: Cliff Wickman; +Cc: linux-numa, Amit K Arora, deepti.kalra
Cliff
Thank you for providing the correction. It looks like the best thing to do with changed
buffer size context after adding earlier patch that I sent.
I had one observation, though it doesn't impact this issue directly. In function
copy_bitmask_to_bitmask() 3rd condition looked redundant to me. Since first 2 conditions
cover all the cases, in that situation would these conditions make sense ?
-----------------------------------------------------------------------------------------------
else {
bytes = CPU_BYTES(bmpfrom->size);
memcpy(bmpto->maskp, bmpfrom->maskp, bytes);
}
-----------------------------------------------------------------------------------------------
Do let us know when can we expect these patches upstream.
Thank you
Sharyathi
On 02/02/2010 04:07 AM, Cliff Wickman wrote:
> Hi Sharyathi,
>
> Thanks for both patch and test case.
>
> The patch needs one more change I think.
> The target buffer may be bigger, so the copy of the map needs
> to be zero-extended.
> Would you review it?
>
> Thx.
> -Cliff
>
>> Date: Thu, 28 Jan 2010 11:23:05 +0530
>> From: Sharyathi Nagesh<sharyath@in.ibm.com>
>> To: linux-numa@vger.kernel.org, Andi Kleen<andi@firstfloor.org>,
>> Christoph Lameter<clameter@sgi.com>, Cliff Wickman<cpw@sgi.com>,
>> Lee Schermerhorn<lee.schermerhorn@hp.com>,
>> Amit K Arora<amitarora@in.ibm.com>, deepti.kalra@in.ibm.com
>> Subject: Fix to numa_node_to_cpus_v2
>>
>> Hi
>>
>> We observed that numa_node_to_cpus api() api converts a node number to a
>> bitmask of CPUs. The user must pass a long enough buffer. If the buffer is not
>> long enough errno will be set to ERANGE and -1 returned. On success 0 is returned.
>> This api has been changed in numa version 2.0. It has new implementation (_v2)
>>
>> Analysis:
>> Now within the numa_node_to_cpus code there is a check if the size of buffer
>> passed from the user matches the one returned by the sched_getaffinity. This
>> check fails and hence we see "map size mismatch: abort" messages coming out on
>> console. My system has 4 node and 8 CPUs.
>>
>> Testcase to reproduce the problem:
>> #include<errno.h>
>> #include<stdio.h>
>> #include<stdlib.h>
>> #include<numa.h>
>>
>> typedef unsigned long BUF[64];
>>
>> int numa_exit_on_error = 0;
>>
>> void node_to_cpus(void)
>> {
>> int i;
>> BUF cpubuf;
>> BUF affinityCPUs;
>> int maxnode = numa_max_node();
>> printf("available: %d nodes (0-%d)\n", 1+maxnode, maxnode);
>> for (i = 0; i<= maxnode; i++) {
>> printf("Calling numa_node_to_cpus()\n");
>> printf("Size of BUF is : %d \n",sizeof(BUF));
>> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
>> printf("Calling numa_node_to_cpus() again \n");
>> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
>> } else {
>> printf("Got< 0 \n");
>> numa_error("numa_node_to_cpu");
>> numa_exit_on_error = 1;
>> exit(numa_exit_on_error);
>> }
>> } else {
>> numa_error("numa_node_to_cpu 0");
>> numa_exit_on_error = 1;
>> exit(numa_exit_on_error);
>> }
>> }
>> }
>> int main()
>> {
>> void node_to_cpus();
>> if (numa_available()< 0)
>> {
>> printf("This system does not support NUMA policy\n");
>> numa_error("numa_available");
>> numa_exit_on_error = 1;
>> exit(numa_exit_on_error);
>> }
>> node_to_cpus();
>> return numa_exit_on_error;
>> }
>> --------------------------------------------------------------------------------
>>
>> Problem Fix:
>> The fix is to allow numa_node_to_cpus_v2() to fail only when the supplied
>> buffer is smaller than the bitmask required to represent online NUMA nodes.
>> Attaching the patch to address this issues, patch is generated against numactl-2.0.4-rc1
>>
>> Regards
>> Yeehaw
>>
> ---
> libnuma.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: numactl-dev/libnuma.c
> ===================================================================
> --- numactl-dev.orig/libnuma.c
> +++ numactl-dev/libnuma.c
> @@ -1272,11 +1272,11 @@ numa_node_to_cpus_v2(int node, struct bi
>
> if (node_cpu_mask_v2[node]) {
> /* have already constructed a mask for this node */
> - if (buffer->size != node_cpu_mask_v2[node]->size) {
> + if (buffer->size< node_cpu_mask_v2[node]->size) {
> numa_error("map size mismatch; abort\n");
> return -1;
> }
> - memcpy(buffer->maskp, node_cpu_mask_v2[node]->maskp, bufferlen);
> + copy_bitmask_to_bitmask(node_cpu_mask_v2[node], buffer);
> return 0;
> }
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Fix to numa_node_to_cpus_v2
2010-02-02 5:16 ` Sharyathi Nagesh
@ 2010-02-02 13:40 ` Cliff Wickman
0 siblings, 0 replies; 4+ messages in thread
From: Cliff Wickman @ 2010-02-02 13:40 UTC (permalink / raw)
To: Sharyathi Nagesh; +Cc: linux-numa, Amit K Arora, deepti.kalra
Hi Sharyathi
On Tue, Feb 02, 2010 at 10:46:41AM +0530, Sharyathi Nagesh wrote:
> Cliff
> Thank you for providing the correction. It looks like the best thing to
> do with changed buffer size context after adding earlier patch that I
> sent.
> I had one observation, though it doesn't impact this issue directly. In
> function copy_bitmask_to_bitmask() 3rd condition looked redundant to me.
> Since first 2 conditions cover all the cases, in that situation would
> these conditions make sense ?
> -----------------------------------------------------------------------------------------------
> else {
> bytes = CPU_BYTES(bmpfrom->size);
> memcpy(bmpto->maskp, bmpfrom->maskp, bytes);
> }
> -----------------------------------------------------------------------------------------------
Indeed extraneous. I removed them. Thanks.
> Do let us know when can we expect these patches upstream.
> Thank you
> Sharyathi
Your patch is included in numactl-2.0.4-rc2.tar.gz
(at ftp://oss.sgi.com/www/projects/libnuma/download/ )
-Cliff
>
> On 02/02/2010 04:07 AM, Cliff Wickman wrote:
>> Hi Sharyathi,
>>
>> Thanks for both patch and test case.
>>
>> The patch needs one more change I think.
>> The target buffer may be bigger, so the copy of the map needs
>> to be zero-extended.
>> Would you review it?
>>
>> Thx.
>> -Cliff
>>
>>> Date: Thu, 28 Jan 2010 11:23:05 +0530
>>> From: Sharyathi Nagesh<sharyath@in.ibm.com>
>>> To: linux-numa@vger.kernel.org, Andi Kleen<andi@firstfloor.org>,
>>> Christoph Lameter<clameter@sgi.com>, Cliff Wickman<cpw@sgi.com>,
>>> Lee Schermerhorn<lee.schermerhorn@hp.com>,
>>> Amit K Arora<amitarora@in.ibm.com>, deepti.kalra@in.ibm.com
>>> Subject: Fix to numa_node_to_cpus_v2
>>>
>>> Hi
>>>
>>> We observed that numa_node_to_cpus api() api converts a node number to a
>>> bitmask of CPUs. The user must pass a long enough buffer. If the buffer is not
>>> long enough errno will be set to ERANGE and -1 returned. On success 0 is returned.
>>> This api has been changed in numa version 2.0. It has new implementation (_v2)
>>>
>>> Analysis:
>>> Now within the numa_node_to_cpus code there is a check if the size of buffer
>>> passed from the user matches the one returned by the sched_getaffinity. This
>>> check fails and hence we see "map size mismatch: abort" messages coming out on
>>> console. My system has 4 node and 8 CPUs.
>>>
>>> Testcase to reproduce the problem:
>>> #include<errno.h>
>>> #include<stdio.h>
>>> #include<stdlib.h>
>>> #include<numa.h>
>>>
>>> typedef unsigned long BUF[64];
>>>
>>> int numa_exit_on_error = 0;
>>>
>>> void node_to_cpus(void)
>>> {
>>> int i;
>>> BUF cpubuf;
>>> BUF affinityCPUs;
>>> int maxnode = numa_max_node();
>>> printf("available: %d nodes (0-%d)\n", 1+maxnode, maxnode);
>>> for (i = 0; i<= maxnode; i++) {
>>> printf("Calling numa_node_to_cpus()\n");
>>> printf("Size of BUF is : %d \n",sizeof(BUF));
>>> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
>>> printf("Calling numa_node_to_cpus() again \n");
>>> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
>>> } else {
>>> printf("Got< 0 \n");
>>> numa_error("numa_node_to_cpu");
>>> numa_exit_on_error = 1;
>>> exit(numa_exit_on_error);
>>> }
>>> } else {
>>> numa_error("numa_node_to_cpu 0");
>>> numa_exit_on_error = 1;
>>> exit(numa_exit_on_error);
>>> }
>>> }
>>> }
>>> int main()
>>> {
>>> void node_to_cpus();
>>> if (numa_available()< 0)
>>> {
>>> printf("This system does not support NUMA policy\n");
>>> numa_error("numa_available");
>>> numa_exit_on_error = 1;
>>> exit(numa_exit_on_error);
>>> }
>>> node_to_cpus();
>>> return numa_exit_on_error;
>>> }
>>> --------------------------------------------------------------------------------
>>>
>>> Problem Fix:
>>> The fix is to allow numa_node_to_cpus_v2() to fail only when the supplied
>>> buffer is smaller than the bitmask required to represent online NUMA nodes.
>>> Attaching the patch to address this issues, patch is generated against numactl-2.0.4-rc1
>>>
>>> Regards
>>> Yeehaw
>>>
>> ---
>> libnuma.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> Index: numactl-dev/libnuma.c
>> ===================================================================
>> --- numactl-dev.orig/libnuma.c
>> +++ numactl-dev/libnuma.c
>> @@ -1272,11 +1272,11 @@ numa_node_to_cpus_v2(int node, struct bi
>>
>> if (node_cpu_mask_v2[node]) {
>> /* have already constructed a mask for this node */
>> - if (buffer->size != node_cpu_mask_v2[node]->size) {
>> + if (buffer->size< node_cpu_mask_v2[node]->size) {
>> numa_error("map size mismatch; abort\n");
>> return -1;
>> }
>> - memcpy(buffer->maskp, node_cpu_mask_v2[node]->maskp, bufferlen);
>> + copy_bitmask_to_bitmask(node_cpu_mask_v2[node], buffer);
>> return 0;
>> }
>>
--
Cliff Wickman
SGI
cpw@sgi.com
(651) 683-3824
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2010-02-02 13:40 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-02-01 22:37 Fix to numa_node_to_cpus_v2 Cliff Wickman
2010-02-02 5:16 ` Sharyathi Nagesh
2010-02-02 13:40 ` Cliff Wickman
-- strict thread matches above, loose matches on Subject: below --
2010-01-28 5:53 Sharyathi Nagesh
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).