Date: Thu, 13 Oct 2016 16:28:27 +0530
From: Anshuman Khandual
To: Michal Hocko
CC: Mel Gorman, Linux Kernel Mailing List, Linux Memory Management List,
    Andrew Morton, "Aneesh Kumar K.V", Balbir Singh, Vlastimil Babka,
    Minchan Kim
Subject: Re: MPOL_BIND on memory only nodes
References: <57FE0184.6030008@linux.vnet.ibm.com>
    <20161012094337.GH17128@dhcp22.suse.cz>
    <20161012131626.GL17128@dhcp22.suse.cz>
    <57FF59EE.9050508@linux.vnet.ibm.com>
    <20161013100708.GI21678@dhcp22.suse.cz>
In-Reply-To: <20161013100708.GI21678@dhcp22.suse.cz>
Message-Id: <57FF68D3.5030507@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/13/2016 03:37 PM, Michal Hocko wrote:
> On Thu 13-10-16 15:24:54, Anshuman Khandual wrote:
> [...]
>> Which makes the function look like this.
>> Even with these changes, MPOL_BIND is
>> still going to pick up the local node's zonelist instead of the first node in
>> policy->v.nodes nodemask. It completely ignores policy->v.nodes which it should
>> not.
>
> Not really. I have tried to explain earlier. We do not ignore policy
> nodemask. This one comes from policy_nodemask. We start with the local
> node but fallback to some of the nodes from the nodemask defined by the
> policy.
>

Yeah, I saw your response but did not quite get it. We don't ignore the
policy nodemask during memory allocation, correct. But my point was that
we ignore the policy nodemask while selecting the zonelist which will be
used during page allocation. Though the zone contents of both zonelists
are likely to be the same, would it not be better to pick the zonelist
from the nodemask as well? Or am I still missing something here? The
following change is what I am trying to propose.

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ad1c96a..f60ab80 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1685,14 +1685,7 @@ static struct zonelist *policy_zonelist(gfp_t gfp, struct mempolicy *policy,
 			nd = policy->v.preferred_node;
 		break;
 	case MPOL_BIND:
-		/*
-		 * Normally, MPOL_BIND allocations are node-local within the
-		 * allowed nodemask. However, if __GFP_THISNODE is set and the
-		 * current node isn't part of the mask, we use the zonelist for
-		 * the first node in the mask instead.
-		 */
-		if (unlikely(gfp & __GFP_THISNODE) &&
-				unlikely(!node_isset(nd, policy->v.nodes)))
+		if (unlikely(!node_isset(nd, policy->v.nodes)))
 			nd = first_node(policy->v.nodes);
 		break;
 	default:
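
To make the difference concrete, here is a rough user-space model of just
the node selection in the MPOL_BIND case. It is a sketch and not kernel
code: the plain bitmask nodemask_t, the helper bodies, MAX_NODES and the
bind_node_current()/bind_node_proposed() names are simplifications I made
up for illustration. Only the two conditions mirror the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8	/* hypothetical limit, only for this model */

/* Simplified stand-in for the kernel nodemask: bit n set => node n allowed. */
typedef unsigned int nodemask_t;

static bool node_isset(int node, nodemask_t mask)
{
	return (mask >> node) & 1u;
}

static int first_node(nodemask_t mask)
{
	assert(mask != 0);	/* this model assumes a non-empty bind mask */
	for (int n = 0; n < MAX_NODES; n++)
		if (node_isset(n, mask))
			return n;
	return -1;		/* only reached if no bit below MAX_NODES is set */
}

/*
 * Existing check: switch away from the local node only when
 * __GFP_THISNODE is set and the local node is outside the mask.
 */
static int bind_node_current(int local, nodemask_t mask, bool gfp_thisnode)
{
	int nd = local;

	if (gfp_thisnode && !node_isset(nd, mask))
		nd = first_node(mask);
	return nd;
}

/*
 * Proposed check: switch whenever the local node is outside the mask,
 * regardless of __GFP_THISNODE.
 */
static int bind_node_proposed(int local, nodemask_t mask)
{
	int nd = local;

	if (!node_isset(nd, mask))
		nd = first_node(mask);
	return nd;
}
```

With local node 0 and a policy mask covering nodes 1-2 (0x6), the existing
check keeps node 0 as the zonelist source unless __GFP_THISNODE is set,
whereas the proposed check selects node 1. When the local node is already
in the mask, both return the local node.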