Date: Wed, 12 Oct 2016 14:55:24 +0530
From: Anshuman Khandual
To: Linux Kernel Mailing List
CC: Linux Memory Management List, Andrew Morton, Mel Gorman, Michal Hocko, "Aneesh Kumar K.V", Balbir Singh, Vlastimil Babka, Minchan Kim
Subject: MPOL_BIND on memory only nodes
Message-Id: <57FE0184.6030008@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

We have the following function, policy_zonelist(), which selects a zonelist during various allocation paths. With this, general user space allocations (which, IIUC, might not have __GFP_THISNODE set) fail while trying to get memory from a memory-only node without CPUs, because the application runs somewhere else and the node it runs on is not part of the nodemask. Why do we insist on __GFP_THISNODE?

With any memory-only node, it's likely that the local node "nd" will not be part of the nodemask. Does it therefore make sense to pick the first node of the nodemask in those cases without checking for __GFP_THISNODE?

/* Return a zonelist indicated by gfp for node representing a mempolicy */
static struct zonelist *policy_zonelist(gfp_t gfp, struct mempolicy *policy,
	int nd)
{
	switch (policy->mode) {
	case MPOL_PREFERRED:
		if (!(policy->flags & MPOL_F_LOCAL))
			nd = policy->v.preferred_node;
		break;
	case MPOL_BIND:
		/*
		 * Normally, MPOL_BIND allocations are node-local within the
		 * allowed nodemask. However, if __GFP_THISNODE is set and the
		 * current node isn't part of the mask, we use the zonelist for
		 * the first node in the mask instead.
		 */
		if (unlikely(gfp & __GFP_THISNODE) &&
				unlikely(!node_isset(nd, policy->v.nodes)))
			nd = first_node(policy->v.nodes);
		break;
	default:
		BUG();
	}
	return node_zonelist(nd, gfp);
}

- Anshuman
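
To make the question concrete, here is a rough user-space sketch of only the MPOL_BIND branch above. It models which node's zonelist gets picked, nothing more. All names (model_node_isset(), model_first_node(), model_bind_node(), MODEL_MAX_NODES) are hypothetical stand-ins and not kernel API, the nodemask is assumed to fit in a 64-bit word, and the __GFP_THISNODE test is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are kernel API. */
#define MODEL_MAX_NODES 64

/* Is node nd set in the 64-bit mask? Loosely mirrors node_isset(). */
bool model_node_isset(int nd, unsigned long long mask)
{
	assert(nd >= 0 && nd < MODEL_MAX_NODES);
	return (mask >> nd) & 1ULL;
}

/*
 * Lowest node set in the mask, or MODEL_MAX_NODES if the mask is empty.
 * Loosely mirrors first_node().
 */
int model_first_node(unsigned long long mask)
{
	for (int nd = 0; nd < MODEL_MAX_NODES; nd++)
		if ((mask >> nd) & 1ULL)
			return nd;
	return MODEL_MAX_NODES;
}

/*
 * Models the MPOL_BIND case of policy_zonelist(): returns the node whose
 * zonelist would be used. "thisnode" stands for (gfp & __GFP_THISNODE).
 */
int model_bind_node(bool thisnode, unsigned long long mask, int nd)
{
	/* Only switch away from the local node when both conditions hold. */
	if (thisnode && !model_node_isset(nd, mask))
		nd = model_first_node(mask);
	return nd;
}
```

With a nodemask of {2} and the task running on node 0, model_bind_node(false, 1ULL << 2, 0) returns 0, so the local node's zonelist is kept, while model_bind_node(true, 1ULL << 2, 0) returns 2. The sketch says nothing about what the allocator then does with that zonelist and the nodemask.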