From mboxrd@z Thu Jan 1 00:00:00 1970 From: Cliff Wickman Subject: Re: Fix to numa_node_to_cpus_v2 Date: Tue, 2 Feb 2010 07:40:59 -0600 Message-ID: <20100202134059.GA2508@sgi.com> References: <4B67B539.6030708@in.ibm.com> Mime-Version: 1.0 Return-path: Content-Disposition: inline In-Reply-To: <4B67B539.6030708@in.ibm.com> Sender: linux-numa-owner@vger.kernel.org List-ID: Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Sharyathi Nagesh Cc: linux-numa@vger.kernel.org, Amit K Arora , deepti.kalra@in.ibm.com Hi Sharyathi On Tue, Feb 02, 2010 at 10:46:41AM +0530, Sharyathi Nagesh wrote: > Cliff > Thank you for providing the correction. It looks like the best thing to > do with changed buffer size context after adding earlier patch that I > sent. > I had one observation, though it doesn't impact this issue directly. In > function copy_bitmask_to_bitmask() 3rd condition looked redundant to me. > Since first 2 conditions cover all the cases, in that situation would > these conditions make sense ? > ----------------------------------------------------------------------------------------------- > else { > bytes = CPU_BYTES(bmpfrom->size); > memcpy(bmpto->maskp, bmpfrom->maskp, bytes); > } > ----------------------------------------------------------------------------------------------- Indeed extraneous. I removed them. Thanks. > Do let us know when can we expect these patches upstream. > Thank you > Sharyathi Your patch is included in numactl-2.0.4-rc2.tar.gz (at ftp://oss.sgi.com/www/projects/libnuma/download/ ) -Cliff > > On 02/02/2010 04:07 AM, Cliff Wickman wrote: >> Hi Sharyathi, >> >> Thanks for both patch and test case. >> >> The patch needs one more change I think. >> The target buffer may be bigger, so the copy of the map needs >> to be zero-extended. >> Would you review it? >> >> Thx. >> -Cliff >> >>> Date: Thu, 28 Jan 2010 11:23:05 +0530 >>> From: Sharyathi Nagesh >>> To: linux-numa@vger.kernel.org, Andi Kleen, >>> Christoph Lameter, Cliff Wickman, >>> Lee Schermerhorn, >>> Amit K Arora, deepti.kalra@in.ibm.com >>> Subject: Fix to numa_node_to_cpus_v2 >>> >>> Hi >>> >>> We observed that numa_node_to_cpus api() api converts a node number to a >>> bitmask of CPUs. The user must pass a long enough buffer. If the buffer is not >>> long enough errno will be set to ERANGE and -1 returned. On success 0 is returned. >>> This api has been changed in numa version 2.0. It has new implementation (_v2) >>> >>> Analysis: >>> Now within the numa_node_to_cpus code there is a check if the size of buffer >>> passed from the user matches the one returned by the sched_getaffinity. This >>> check fails and hence we see "map size mismatch: abort" messages coming out on >>> console. My system has 4 node and 8 CPUs. >>> >>> Testcase to reproduce the problem: >>> #include >>> #include >>> #include >>> #include >>> >>> typedef unsigned long BUF[64]; >>> >>> int numa_exit_on_error = 0; >>> >>> void node_to_cpus(void) >>> { >>> int i; >>> BUF cpubuf; >>> BUF affinityCPUs; >>> int maxnode = numa_max_node(); >>> printf("available: %d nodes (0-%d)\n", 1+maxnode, maxnode); >>> for (i = 0; i<= maxnode; i++) { >>> printf("Calling numa_node_to_cpus()\n"); >>> printf("Size of BUF is : %d \n",sizeof(BUF)); >>> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) { >>> printf("Calling numa_node_to_cpus() again \n"); >>> if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) { >>> } else { >>> printf("Got< 0 \n"); >>> numa_error("numa_node_to_cpu"); >>> numa_exit_on_error = 1; >>> exit(numa_exit_on_error); >>> } >>> } else { >>> numa_error("numa_node_to_cpu 0"); >>> numa_exit_on_error = 1; >>> exit(numa_exit_on_error); >>> } >>> } >>> } >>> int main() >>> { >>> void node_to_cpus(); >>> if (numa_available()< 0) >>> { >>> printf("This system does not support NUMA policy\n"); >>> numa_error("numa_available"); >>> numa_exit_on_error = 1; >>> exit(numa_exit_on_error); >>> } >>> node_to_cpus(); >>> return numa_exit_on_error; >>> } >>> -------------------------------------------------------------------------------- >>> >>> Problem Fix: >>> The fix is to allow numa_node_to_cpus_v2() to fail only when the supplied >>> buffer is smaller than the bitmask required to represent online NUMA nodes. >>> Attaching the patch to address this issues, patch is generated against numactl-2.0.4-rc1 >>> >>> Regards >>> Yeehaw >>> >> --- >> libnuma.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> Index: numactl-dev/libnuma.c >> =================================================================== >> --- numactl-dev.orig/libnuma.c >> +++ numactl-dev/libnuma.c >> @@ -1272,11 +1272,11 @@ numa_node_to_cpus_v2(int node, struct bi >> >> if (node_cpu_mask_v2[node]) { >> /* have already constructed a mask for this node */ >> - if (buffer->size != node_cpu_mask_v2[node]->size) { >> + if (buffer->size< node_cpu_mask_v2[node]->size) { >> numa_error("map size mismatch; abort\n"); >> return -1; >> } >> - memcpy(buffer->maskp, node_cpu_mask_v2[node]->maskp, bufferlen); >> + copy_bitmask_to_bitmask(node_cpu_mask_v2[node], buffer); >> return 0; >> } >> -- Cliff Wickman SGI cpw@sgi.com (651) 683-3824