From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sharyathi Nagesh
Subject: Fix to numa_node_to_cpus_v2
Date: Thu, 28 Jan 2010 11:23:05 +0530
Message-ID: <4B612641.6050305@in.ibm.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------040002070707080909080304"
Return-path:
Sender: linux-numa-owner@vger.kernel.org
List-ID:
To: linux-numa@vger.kernel.org, Andi Kleen, Christoph Lameter, Cliff Wickman, Lee Schermerhorn, Amit

This is a multi-part message in MIME format.
--------------040002070707080909080304
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

We observed that the numa_node_to_cpus() API converts a node number to a bitmask of CPUs. The user must pass a long enough buffer; if the buffer is not long enough, errno is set to ERANGE and -1 is returned. On success 0 is returned. This API has been changed in numa version 2.0 and has a new implementation (_v2).

Analysis: Within the numa_node_to_cpus() code there is a check whether the size of the buffer passed in by the user matches the size returned by sched_getaffinity(). This check fails, and hence we see "map size mismatch; abort" messages on the console. My system has 4 nodes and 8 CPUs.
------------------------------------------------------------------------------------
Testcase to reproduce the problem:

#include <stdio.h>
#include <stdlib.h>
#include <numa.h>

typedef unsigned long BUF[64];

int numa_exit_on_error = 0;

void node_to_cpus(void)
{
	int i;
	BUF cpubuf;
	BUF affinityCPUs; /* unused in this reduced testcase */
	int maxnode = numa_max_node();

	printf("available: %d nodes (0-%d)\n", 1 + maxnode, maxnode);
	for (i = 0; i <= maxnode; i++) {
		printf("Calling numa_node_to_cpus()\n");
		printf("Size of BUF is: %zu\n", sizeof(BUF));
		if (0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF))) {
			printf("Calling numa_node_to_cpus() again\n");
			if (0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF))) {
				/* second call succeeded as well */
			} else {
				printf("Got < 0\n");
				numa_error("numa_node_to_cpu");
				numa_exit_on_error = 1;
				exit(numa_exit_on_error);
			}
		} else {
			numa_error("numa_node_to_cpu 0");
			numa_exit_on_error = 1;
			exit(numa_exit_on_error);
		}
	}
}

int main(void)
{
	if (numa_available() < 0) {
		printf("This system does not support NUMA policy\n");
		numa_error("numa_available");
		numa_exit_on_error = 1;
		exit(numa_exit_on_error);
	}
	node_to_cpus();
	return numa_exit_on_error;
}
------------------------------------------------------------------------------------

Problem Fix: The fix is to allow numa_node_to_cpus_v2() to fail only when the supplied buffer is smaller than the bitmask needed to represent the online CPUs.
Attaching the patch to address this issue; the patch is generated against numactl-2.0.4-rc1.

Regards
Yeehaw

--------------040002070707080909080304
Content-Type: text/x-patch; name="fix_numa_node_to_cpus_v2.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="fix_numa_node_to_cpus_v2.patch"

Index: numactl-2.0.4-rc1/libnuma.c
===================================================================
--- numactl-2.0.4-rc1.orig/libnuma.c	2009-12-16 02:48:26.000000000 +0530
+++ numactl-2.0.4-rc1/libnuma.c	2010-01-27 17:06:30.000000000 +0530
@@ -1272,7 +1272,7 @@
 	if (node_cpu_mask_v2[node]) {
 		/* have already constructed a mask for this node */
-		if (buffer->size != node_cpu_mask_v2[node]->size) {
+		if (buffer->size < node_cpu_mask_v2[node]->size) {
 			numa_error("map size mismatch; abort\n");
 			return -1;
 		}

--------------040002070707080909080304--