From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752401AbaGGBUs (ORCPT ); Sun, 6 Jul 2014 21:20:48 -0400 Received: from fgwmail6.fujitsu.co.jp ([192.51.44.36]:54092 "EHLO fgwmail6.fujitsu.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752219AbaGGBUr (ORCPT ); Sun, 6 Jul 2014 21:20:47 -0400 X-SecurityPolicyCheck: OK by SHieldMailChecker v2.0.1 X-SHieldMailCheckerPolicyVersion: FJ-ISEC-20120718-3 Message-ID: <53B9F5CB.8080703@jp.fujitsu.com> Date: Mon, 7 Jul 2014 10:20:11 +0900 From: Yasuaki Ishimatsu User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Lai Jiangshan CC: , Subject: Re: [PATCH] workqueue: initialize cpumask of wq_numa_possible_cpumask References: <53B98591.5030000@jp.fujitsu.com> <53B9E778.9070807@cn.fujitsu.com> <53B9EAC8.6080608@jp.fujitsu.com> <53B9F216.1020307@cn.fujitsu.com> In-Reply-To: <53B9F216.1020307@cn.fujitsu.com> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 7bit X-SecurityPolicyCheck-GC: OK by FENCE-Mail Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org (2014/07/07 10:04), Lai Jiangshan wrote: > On 07/07/2014 08:33 AM, Yasuaki Ishimatsu wrote: >> (2014/07/07 9:19), Lai Jiangshan wrote: >>> On 07/07/2014 01:21 AM, Yasuaki Ishimatsu wrote: >>>> When hot-adding and onlining CPU, kernel panic occurs, showing following >>>> call trace. >>>> >>>> BUG: unable to handle kernel paging request at 0000000000001d08 >>>> IP: [] __alloc_pages_nodemask+0x9d/0xb10 >>>> PGD 0 >>>> Oops: 0000 [#1] SMP >>>> ... >>>> Call Trace: >>>> [] ? cpumask_next_and+0x35/0x50 >>>> [] ? find_busiest_group+0x113/0x8f0 >>>> [] ? deactivate_slab+0x349/0x3c0 >>>> [] new_slab+0x91/0x300 >>>> [] __slab_alloc+0x2bb/0x482 >>>> [] ? copy_process.part.25+0xfc/0x14c0 >>>> [] ? load_balance+0x218/0x890 >>>> [] ? sched_clock+0x9/0x10 >>>> [] ? trace_clock_local+0x9/0x10 >>>> [] kmem_cache_alloc_node+0x8c/0x200 >>>> [] copy_process.part.25+0xfc/0x14c0 >>>> [] ? trace_buffer_unlock_commit+0x4d/0x60 >>>> [] ? kthread_create_on_node+0x140/0x140 >>>> [] do_fork+0xbc/0x360 >>>> [] kernel_thread+0x26/0x30 >>>> [] kthreadd+0x2c2/0x300 >>>> [] ? kthread_create_on_cpu+0x60/0x60 >>>> [] ret_from_fork+0x7c/0xb0 >>>> [] ? kthread_create_on_cpu+0x60/0x60 >>>> >>>> In my investigation, I found the root cause is wq_numa_possible_cpumask. >>>> All entries of wq_numa_possible_cpumask is allocated by >>>> alloc_cpumask_var_node(). And these entries are used without initializing. >>>> So these entries have wrong value. >>>> >>>> When hot-adding and onlining CPU, wq_update_unbound_numa() is called. >>>> wq_update_unbound_numa() calls alloc_unbound_pwq(). And alloc_unbound_pwq() >>>> calls get_unbound_pool(). In get_unbound_pool(), worker_pool->node is set >>>> as follow: >>>> >>>> #kernel/workqueue.c >>>> 3592 /* if cpumask is contained inside a NUMA node, we belong to that node */ >>>> 3593 if (wq_numa_enabled) { >>>> 3594 for_each_node(node) { >>>> 3595 if (cpumask_subset(pool->attrs->cpumask, >>>> 3596 wq_numa_possible_cpumask[node])) { >>>> 3597 pool->node = node; >>>> 3598 break; >>>> 3599 } >>>> 3600 } >>>> 3601 } >>>> >>>> But wq_numa_possible_cpumask[node] does not have correct cpumask. So, wrong >>>> node is selected. As a result, kernel panic occurs. >>>> >>>> By this patch, all entries of wq_numa_possible_cpumask are allocated by >>>> zalloc_cpumask_var_node to initialize them. And the panic disappeared. > > Hi, Yasuaki > > You said the panic disappeared with the old patch, how did it happen > since the old patch was considered incorrect? > > Did the panic happen so rarely that it was mistaken disappeared? > > How did you test the new one? I tested new patch. And I confirmed the panic disappeared. The patch is one liner. So after testing correct patch (new one), I wrote a patch into my tree by hand not use git format-patch/am command. Then I mistook to create old patch and send it. So I tested new patch and didn't test old patch. Sorry for my mistake. > > In the point of review, we definitely need to use zalloc_cpumask_var_node() > instead of alloc_cpumask_var_node() in wq_numa_init(). > > So for the new patch: > > Reviewed-by: Lai Jiangshan Thanks for your review. Thanks, Yasuaki Ishimatsu > > Thanks, > Lai > >>>> >>>> Signed-off-by: Yasuaki Ishimatsu >>> >>> Hi, Yasuaki >>> >>> All cpumasks in the wq_numa_possible_cpumask array are allocated in >>> wq_numa_init(): >>> >>> for_each_node(node) >>> BUG_ON(!alloc_cpumask_var_node(&tbl[node], GFP_KERNEL, >>> node_online(node) ? node : NUMA_NO_NODE)); >>> >>> [snip...] >>> >>> wq_numa_possible_cpumask = tbl; >>> >>> I didn't find out how does this patch make the all entries of >>> wq_numa_possible_cpumask zeroed. >> >> Sorry. I mistook. I will resend soon. >> >> Thanks, >> Yasuaki Ishimatsu. >> >>> >>> Or I misunderstood. >>> >>> Thanks, >>> Lai >>> >>>> --- >>>> kernel/workqueue.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c >>>> index 6203d29..b393ded 100644 >>>> --- a/kernel/workqueue.c >>>> +++ b/kernel/workqueue.c >>>> @@ -3338,7 +3338,7 @@ struct workqueue_attrs *alloc_workqueue_attrs(gfp_t gfp_mask) >>>> attrs = kzalloc(sizeof(*attrs), gfp_mask); >>>> if (!attrs) >>>> goto fail; >>>> - if (!alloc_cpumask_var(&attrs->cpumask, gfp_mask)) >>>> + if (!zalloc_cpumask_var(&attrs->cpumask, gfp_mask)) >>>> goto fail; >>>> >>>> cpumask_copy(attrs->cpumask, cpu_possible_mask); >>>> >>>> . >>>> >>> >> >> >> . >> >