From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753599AbbCBIwt (ORCPT ); Mon, 2 Mar 2015 03:52:49 -0500
Received: from mgwkm02.jp.fujitsu.com ([202.219.69.169]:26656
	"EHLO mgwkm02.jp.fujitsu.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1750911AbbCBIws (ORCPT );
	Mon, 2 Mar 2015 03:52:48 -0500
X-Greylist: delayed 612 seconds by postgrey-1.27 at vger.kernel.org; Mon, 02 Mar 2015 03:52:47 EST
X-SecurityPolicyCheck: OK by SHieldMailChecker v2.2.3
X-SHieldMailCheckerPolicyVersion: FJ-ISEC-20140219-2
Message-ID: <54F42221.6000904@jp.fujitsu.com>
Date: Mon, 2 Mar 2015 17:41:05 +0900
From: Kamezawa Hiroyuki
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: Tejun Heo , Gu Zheng
CC: , , ,
Subject: Re: [PATCH] workqueue: update numa affinity when node hotplug
References: <1425031492-32300-1-git-send-email-guz.fnst@cn.fujitsu.com> <20150227115401.GD3964@htj.duckdns.org>
In-Reply-To: <20150227115401.GD3964@htj.duckdns.org>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-SecurityPolicyCheck-GC: OK by FENCE-Mail
X-TM-AS-MML: disable
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 2015/02/27 20:54, Tejun Heo wrote:
> Hello,
>
> On Fri, Feb 27, 2015 at 06:04:52PM +0800, Gu Zheng wrote:
>> Yasuaki Ishimatsu found that with node online/offline, cpu<->node
>> relationship is established. Because workqueue uses a info which was
>> established at boot time, but it may be changed by node hotpluging.
>
> I've asked this a couple times now but can somebody please justify why
> cpu<->node relationship needs to change? If it has to change, that's
> okay but let's please first make sure that we understand why such
> thing is necessary so that we can figure out what kind of facilities
> are necessary for such dynamism.
>
> Thanks.
>

Let me start by explaining the current behavior.

- cpu-id is determined when a new processor (lapicid/x2apicid) is found.
  The cpu-id<->nodeid relationship is _not_ recorded.

- node-id is determined when a new pxm (firmware info) is found.
  The pxm<->nodeid relationship is recorded.

Because of this, there are 2 cases in which the cpu<->nodeid relationship
changes.

Case A) On x86, cpus on memory-less nodes are all tied to existing nodes
        (round robin). When memory hot-add happens and a new node comes up,
        the cpus are moved to the newly added node based on pxm.

Case B) When a node is added after another node has been removed, and their
        pxms differ from each other, the cpu<->node relationship changes.

I personally think the proper fix is to build a persistent
cpu-id <-> lapicid relationship, as is done for pxm, rather than creating
a band-aid.

But I have no good idea how to handle Case A), which is in use now.

Hm, Case A) will not be a problem for workqueue's kmalloc, because a node
with ZONE_NORMAL cannot be removed.

Thanks,
-Kame
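
To make Case B) above concrete, here is a toy model in Python. It is only a sketch and not kernel code: the class ToyTopology, its methods, and the "lowest free cpu-id" allocation policy are assumptions made for illustration. The only behavior taken from the description above is that the pxm<->nodeid mapping is recorded and kept, while the cpu-id side is not.

```python
class ToyTopology:
    """Toy model of cpu-id / node-id assignment (hypothetical, not kernel code)."""

    def __init__(self):
        self.pxm_to_node = {}  # recorded once and kept across hot-remove
        self.cpu_to_apic = {}  # live cpus only; forgotten on hot-remove
        self.cpu_to_node = {}  # derived at hot-add time, not persistent

    def _node_for_pxm(self, pxm):
        # A pxm seen for the first time gets the next node id (assumption).
        if pxm not in self.pxm_to_node:
            self.pxm_to_node[pxm] = len(self.pxm_to_node)
        return self.pxm_to_node[pxm]

    def _lowest_free_cpu_id(self):
        # Assumed policy: a new processor takes the lowest unused cpu-id.
        cpu = 0
        while cpu in self.cpu_to_apic:
            cpu += 1
        return cpu

    def hot_add(self, pxm, apic_ids):
        node = self._node_for_pxm(pxm)
        for apic in apic_ids:
            cpu = self._lowest_free_cpu_id()
            self.cpu_to_apic[cpu] = apic
            self.cpu_to_node[cpu] = node
        return node

    def hot_remove(self, pxm):
        node = self.pxm_to_node[pxm]  # the pxm<->node record itself stays
        for cpu in [c for c, n in self.cpu_to_node.items() if n == node]:
            del self.cpu_to_apic[cpu]
            del self.cpu_to_node[cpu]


topo = ToyTopology()
topo.hot_add(pxm=0, apic_ids=[0, 1])
topo.hot_add(pxm=1, apic_ids=[16, 17])
before = dict(topo.cpu_to_node)

topo.hot_remove(pxm=1)                  # remove one node ...
topo.hot_add(pxm=2, apic_ids=[32, 33])  # ... then add a node with another pxm
after = dict(topo.cpu_to_node)

print("before:", before)
print("after: ", after)
```

In this model the cpu-ids freed by the removed node are reused by the processors of the new node, while the new pxm gets a new node-id, so the same cpu-id ends up on a different node. Anything that cached the cpu<->node mapping at boot time would then be stale.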