Re: [RFC PATCH 0/2 shit_A shit_B] workqueue: fix wq_numa bug


From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: "Izumi, Taku/泉 拓" <izumi.taku@jp.fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "Tejun Heo" <tj@kernel.org>,
	"Ishimatsu, Yasuaki/石松 靖章" <isimatu.yasuaki@jp.fujitsu.com>,
	"Gu, Zheng/顾 政" <guz.fnst@cn.fujitsu.com>,
	"Tang, Chen/汤 晨" <tangchen@cn.fujitsu.com>,
	"Kamezawa, Hiroyuki/亀澤 寛之" <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [RFC PATCH 0/2 shit_A shit_B] workqueue: fix wq_numa bug
Date: Fri, 23 Jan 2015 16:18:15 +0800
Message-ID: <54C203C7.4080009@cn.fujitsu.com>
In-Reply-To: <E86EADE93E2D054CBCD4E708C38D364A54134B14@G01JPEXMBYT01>

On 01/23/2015 02:13 PM, Izumi, Taku/泉 拓 wrote:
> 
>> These patches are un-changelogged, un-compiled, un-booted, and un-tested;
>> they are just shits, and I would even prefer them un-sent or blocked.
>>
>> The patches include two -solutions-:
>>
>> Shit_A:
>>   workqueue: reset pool->node and unhash the pool when the node is
>>     offline
>>   update wq_numa when cpu_present_mask changed
>>
>>  kernel/workqueue.c | 107 +++++++++++++++++++++++++++++++++++++++++------------
>>  1 file changed, 84 insertions(+), 23 deletions(-)
>>
>>
>> Shit_B:
>>   workqueue: reset pool->node and unhash the pool when the node is
>>     offline
>>   workqueue: remove wq_numa_possible_cpumask
>>   workqueue: directly update attrs of pools when cpu hot[un]plug
>>
>>  kernel/workqueue.c | 135 +++++++++++++++++++++++++++++++++++++++--------------
>>  1 file changed, 101 insertions(+), 34 deletions(-)
>>
> 
>   I tried your patchsets.
>   linux-3.18.3 + Shit_A:
> 
>     Build OK.
>     I tried to reproduce the problem that Ishimatsu had reported, but it did not occur.
>     It seems that your patch fixes this problem.
> 
>   linux-3.18.3 + Shit_B:
> 
>     Build OK, but I encountered a kernel panic at boot time.

I forgot to initialize pool->unbound_pwqs.

Even so, I prefer Solution_B.

> 
> [    0.189000] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [    0.189000] IP: [<ffffffff8131ef96>] __list_add+0x16/0xc0
> [    0.189000] PGD 0 
> [    0.189000] Oops: 0000 [#1] SMP 
> [    0.189000] Modules linked in:
> [    0.189000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.3+ #3
> [    0.189000] Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 01.81 12/03/2014
> [    0.189000] task: ffff880869678000 ti: ffff880869664000 task.ti: ffff880869664000
> [    0.189000] RIP: 0010:[<ffffffff8131ef96>]  [<ffffffff8131ef96>] __list_add+0x16/0xc0
> [    0.189000] RSP: 0000:ffff880869667be8  EFLAGS: 00010296
> [    0.189000] RAX: ffff88087f83cda8 RBX: ffff88087f83cd80 RCX: 0000000000000000
> [    0.189000] RDX: 0000000000000000 RSI: ffff88086912bb98 RDI: ffff88087f83cd80
> [    0.189000] RBP: ffff880869667c08 R08: 0000000000000000 R09: ffff88087f807480
> [    0.189000] R10: ffffffff810911b6 R11: ffffffff810956ac R12: 0000000000000000
> [    0.189000] R13: ffff88086912bb98 R14: 0000000000000400 R15: 0000000000000400
> [    0.189000] FS:  0000000000000000(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000
> [    0.189000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.189000] CR2: 0000000000000008 CR3: 0000000001998000 CR4: 00000000001407f0
> [    0.189000] Stack:
> [    0.189000]  000000000000000a ffff88086912b800 ffff88087f83cd00 ffff88087f80c000
> [    0.189000]  ffff880869667c48 ffffffff810912c8 ffff880869667c28 ffff88087f803f00
> [    0.189000]  00000000fffffff4 ffff88086964b760 ffff88086964b6a0 ffff88086964b740
> [    0.189000] Call Trace:
> [    0.189000]  [<ffffffff810912c8>] alloc_unbound_pwq+0x298/0x3b0
> [    0.189000]  [<ffffffff81091ce8>] apply_workqueue_attrs+0x158/0x4c0
> [    0.189000]  [<ffffffff81092424>] __alloc_workqueue_key+0x174/0x5b0
> [    0.189000]  [<ffffffff813052a6>] ? alloc_cpumask_var_node+0x56/0x80
> [    0.189000]  [<ffffffff81b21573>] init_workqueues+0x33d/0x40f
> [    0.189000]  [<ffffffff81b21236>] ? ftrace_define_fields_workqueue_execute_start+0x6a/0x6a
> [    0.189000]  [<ffffffff81002144>] do_one_initcall+0xd4/0x210
> [    0.189000]  [<ffffffff81b12f4d>] ? native_smp_prepare_cpus+0x34d/0x352
> [    0.189000]  [<ffffffff81b0026d>] kernel_init_freeable+0xf5/0x23c
> [    0.189000]  [<ffffffff81653370>] ? rest_init+0x80/0x80
> [    0.189000]  [<ffffffff8165337e>] kernel_init+0xe/0xf0
> [    0.189000]  [<ffffffff8166bcfc>] ret_from_fork+0x7c/0xb0
> [    0.189000]  [<ffffffff81653370>] ? rest_init+0x80/0x80
> [    0.189000] Code: ff b8 f4 ff ff ff e9 3b ff ff ff b8 f4 ff ff ff e9 31 ff ff ff 55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 53 48 89 fb 48 83 ec 08 <4c> 8b 42 08 49 39 f0 75 2e 4d 8b 45 00 4d 39 c4 75 6c 4c 39 e3 
> [    0.189000] RIP  [<ffffffff8131ef96>] __list_add+0x16/0xc0
> [    0.189000]  RSP <ffff880869667be8>
> [    0.189000] CR2: 0000000000000008
> [    0.189000] ---[ end trace 58feee6875cf67cf ]---
> [    0.189000] Kernel panic - not syncing: Fatal exception
> [    0.189000] ---[ end Kernel panic - not syncing: Fatal exception
> 
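
The registers in the oops above fit that explanation: RDX, which carries the
`next` argument of __list_add(), is 0, and CR2 is 0x8, which is the offset of
`prev` inside struct list_head on a 64-bit build. Below is a minimal userspace
sketch of the failure mode, not code from the patches. struct pool_sketch and
list_add_checked() are names made up for illustration; the kernel has no such
NULL check, so the read of next->prev simply faults.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same effect as the kernel helper: an empty list points at itself. */
static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Hypothetical pool carrying the list that was left uninitialized. */
struct pool_sketch {
	int node;
	struct list_head unbound_pwqs;
};

/*
 * Insert `new` right after `head`. Returns -1 when the head was never
 * initialized (head->next == NULL); this is the point where the kernel
 * would dereference NULL + offsetof(struct list_head, prev).
 */
static int list_add_checked(struct list_head *new, struct list_head *head)
{
	struct list_head *next = head->next;

	assert(new != NULL && head != NULL);
	if (next == NULL)
		return -1;

	next->prev = new;
	new->next = next;
	new->prev = head;
	head->next = new;
	return 0;
}
```

With a zero-filled pool_sketch, list_add_checked() returns -1; after
INIT_LIST_HEAD(&pool.unbound_pwqs) the same call succeeds. The fix on the
kernel side would be the equivalent INIT_LIST_HEAD() call wherever the pool is
set up, assuming unbound_pwqs is a plain list head as the trace suggests.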
>    
>   Sincerely,
>   Taku Izumi
> 
> 
>> Patch 1 is the same in both solutions: reset pool->node and unhash the pool.
>> It was suggested by TJ, and I found it a good first step toward fixing the bug.
>>
>> The other patches handle wq_numa_possible_cpumask, which is where the
>> solutions diverge.
>>
>> Solution_A uses present_mask rather than possible_cpumask. It adds
>> wq_numa_notify_cpu_present_set/cleared() to be notified of changes to
>> cpu_present_mask.  No such notifications exist right now, so I fake
>> one (wq_numa_check_present_cpumask_changes()) to imitate them.  I hope
>> the memory people add a real one.
>>
>> Solution_B uses online_mask rather than possible_cpumask.
>> This solution removes more of the coupling between the NUMA code and
>> workqueue; it depends only on cpumask_of_node(node).
>>
>> Patch 2 of Solution_B removes wq_numa_possible_cpumask and adds
>> overhead on CPU hot[un]plug; Patch 3 reduces this overhead.
>>
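
To make the Solution_B idea concrete, here is a rough userspace sketch of
deriving a per-node pool mask from the current node mapping and the online
mask, instead of from a table captured at boot. It is a simplification under
stated assumptions: mask_t stands in for struct cpumask (at most 64 CPUs),
node_pool_cpumask() is a hypothetical helper, and the fallback to the
workqueue-wide mask is an assumption rather than something taken from the
patches.

```c
#include <stdint.h>

/* One bit per CPU; a stand-in for struct cpumask (at most 64 CPUs). */
typedef uint64_t mask_t;

/*
 * Hypothetical helper, not taken from the patches.
 *
 * wq_cpus      - CPUs the workqueue is allowed to use
 * cpus_of_node - what cpumask_of_node(node) reports right now
 * online       - CPUs that are currently online
 *
 * Because cpus_of_node is read at call time, a node whose CPU mapping
 * changed after hotplug yields the new mask rather than a stale one.
 */
static mask_t node_pool_cpumask(mask_t wq_cpus, mask_t cpus_of_node,
				mask_t online)
{
	mask_t m = wq_cpus & cpus_of_node & online;

	/* Assumed fallback: no usable CPU on the node, use the wq-wide mask. */
	return m ? m : wq_cpus;
}
```

For example, with wq_cpus = 0xff, a node covering CPUs 2-3 (0x0c) and CPUs 0-3
online (0x0f), the result is 0x0c. If the node's CPUs are all offline, the
sketch falls back to 0xff.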
>> Thanks,
>> Lai
>>
>>
>> Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>> Cc: "Gu, Zheng" <guz.fnst@cn.fujitsu.com>
>> Cc: tangchen <tangchen@cn.fujitsu.com>
>> Cc: Hiroyuki KAMEZAWA <kamezawa.hiroyu@jp.fujitsu.com>
>> --
>> 2.1.0
>>

