From: Donet Tom
To: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Aneesh Kumar, Huang Ying, Michal Hocko, Dave Hansen, Mel Gorman,
	Feng Tang, Andrea Arcangeli, Peter Zijlstra, Ingo Molnar,
	Rik van Riel, Johannes Weiner, Matthew Wilcox, Vlastimil Babka,
	Dan Williams, Hugh Dickins, Kefeng Wang, Suren Baghdasaryan,
	Donet Tom
Subject: [PATCH v4 0/2] Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy
Date: Mon, 25 Mar 2024 09:24:12 -0500

This patchset optimizes cross-socket memory access with the
MPOL_PREFERRED_MANY policy.
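
In outline, the series lets NUMA balancing migrate a page on a protnone
(NUMA hinting) fault under MPOL_PREFERRED_MANY, provided the node the
faulting task is running on is present in the policy nodemask (this is
the condition stated in the v2 changelog entry of this letter). The
following is only a user-space sketch of that one check, not the kernel
code: the helper names node_in_mask() and may_migrate_to_exec_node() are
hypothetical, and a single unsigned long stands in for the kernel's
nodemask type.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): test whether node `nid` is set
 * in a single-word nodemask, such as the 0x03 mask used in the test setup.
 */
bool node_in_mask(unsigned long nodemask, int nid)
{
	/* Reject node IDs that do not fit in one word of the mask. */
	if (nid < 0 || (size_t)nid >= sizeof(nodemask) * CHAR_BIT)
		return false;
	return (nodemask >> nid) & 1UL;
}

/*
 * Sketch of the condition from the changelog: migration towards the
 * execution node is only considered if that node is in the policy
 * nodemask. The real decision in the kernel involves further checks
 * that are not modelled here.
 */
bool may_migrate_to_exec_node(unsigned long policy_nodes, int exec_nid)
{
	return node_in_mask(policy_nodes, exec_nid);
}
```

With a mask of 0x03, a task running on node 0 or node 1 passes this
check, while a task running on node 6 does not.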

To test this patch we ran the following test on a 3 node system.

Node 0 -  2GB - Tier 1
Node 1 - 11GB - Tier 1
Node 6 - 10GB - Tier 2

The following changes were made to memcached to set the memory policy.
They select Node 0 and Node 1 as the preferred nodes.

   #include <numaif.h>
   #include <numa.h>

   unsigned long nodemask;
   int ret;

   nodemask = 0x03;
   ret = set_mempolicy(MPOL_PREFERRED_MANY | MPOL_F_NUMA_BALANCING,
                       &nodemask, 10);
   /* If MPOL_F_NUMA_BALANCING isn't supported,
    * fall back to MPOL_PREFERRED_MANY */
   if (ret < 0 && errno == EINVAL) {
      printf("set mem policy normal\n");
      ret = set_mempolicy(MPOL_PREFERRED_MANY, &nodemask, 10);
   }
   if (ret < 0) {
      perror("Failed to call set_mempolicy");
      exit(-1);
   }

Test Procedure:
===============
1. Make sure memory tiering and demotion are enabled.
2. Start memcached.

   # ./memcached -b 100000 -m 204800 -u root -c 1000000 -t 7
       -d -s "/tmp/memcached.sock"

3. Run memtier_benchmark to store 3200000 keys.

   # ./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
       --threads=1 --pipeline=1 --ratio=1:0 --key-pattern=S:S --key-minimum=1
       --key-maximum=3200000 -n allkeys -c 1 -R -x 1 -d 1024

4. Start a memory eater on nodes 0 and 1. This will demote all memcached
   pages to node 6.
5. Make sure all the memcached pages got demoted to the lower tier by
   reading /proc/<pid>/numa_maps.

   # cat /proc/2771/numa_maps
   ---
   default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
   default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
   ---

6. Kill the memory eater.
7. Read the pgpromote_success counter.
8. Start reading the keys by running memtier_benchmark.

   # ./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
       --pipeline=1 --distinct-client-seed --ratio=0:3 --key-pattern=R:R
       --key-minimum=1 --key-maximum=3200000 -n allkeys
       --threads=64 -c 1 -R -x 6

9. Read the pgpromote_success counter.

Test Results:
=============
Without Patch
------------------
1. pgpromote_success before test
   Node 0:  pgpromote_success 11
   Node 1:  pgpromote_success 140974

   pgpromote_success after test
   Node 0:  pgpromote_success 11
   Node 1:  pgpromote_success 140974

2. Memtier-benchmark result.
AGGREGATED AVERAGE RESULTS (6 runs)
==================================================================
Type     Ops/sec    Hits/sec   Misses/sec  Avg. Latency  p50 Latency
------------------------------------------------------------------
Sets          0.00        ---        ---          ---          ---
Gets     305792.03  305791.93       0.10      0.18949      0.16700
Waits         0.00        ---        ---          ---          ---
Totals   305792.03  305791.93       0.10      0.18949      0.16700

==============================================
Type     p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------
Sets             ---            ---       0.00
Gets         0.44700        1.71100   11542.69
Waits            ---            ---        ---
Totals       0.44700        1.71100   11542.69

With Patch
---------------
1. pgpromote_success before test
   Node 0:  pgpromote_success 5
   Node 1:  pgpromote_success 89386

   pgpromote_success after test
   Node 0:  pgpromote_success 57895
   Node 1:  pgpromote_success 141463

2. Memtier-benchmark result.
AGGREGATED AVERAGE RESULTS (6 runs)
====================================================================
Type     Ops/sec    Hits/sec   Misses/sec  Avg. Latency  p50 Latency
--------------------------------------------------------------------
Sets          0.00        ---        ---          ---          ---
Gets     521942.24  521942.07       0.17      0.11459      0.10300
Waits         0.00        ---        ---          ---          ---
Totals   521942.24  521942.07       0.17      0.11459      0.10300

==============================================
Type     p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------
Sets             ---            ---       0.00
Gets         0.23100        0.31900   19701.68
Waits            ---            ---        ---
Totals       0.23100        0.31900   19701.68

Test Result Analysis:
=====================
1. With the patch, pages are promoted: during the read phase
   pgpromote_success rises by 57890 on Node 0 and by 52077 on Node 1.
   Without the patch the counters do not change.
2. Memtier-benchmark results show that, with the patch, performance
   has increased by more than 50% (Ops/sec is about 70% higher).

   Ops/sec without fix -  305792.03
   Ops/sec with fix    -  521942.24

Changes:
v4:
- Added an example in the "PATCH 2/2" commit message as per the
  discussion from v3.

v3:
- Added "* @vmf: structure describing the fault" comment for
  mpol_misplaced() to fix the warning.
  https://lore.kernel.org/oe-kbuild-all/202403202229.WZeAnUuO-lkp@intel.com/
- https://lore.kernel.org/lkml/cover.1711002865.git.donettom@linux.ibm.com/

v2:
- Rebased on latest upstream (v6.8-rc7)
- Used 'numa_node_id()' to get the current execution node ID. Added
  'lockdep_assert_held' to make sure that the 'mpol_misplaced()' is
  called with ptl held.
- The migration condition has been updated; now, migration will only
  occur if the execution node is present in the policy nodemask.
- https://lore.kernel.org/lkml/cover.1709909210.git.donettom@linux.ibm.com/

v1:
- https://lore.kernel.org/linux-mm/9c3f7b743477560d1c5b12b8c111a584a2cc92ee.1708097962.git.donettom@linux.ibm.com/#t

Donet Tom (2):
  mm/mempolicy: Use numa_node_id() instead of cpu_to_node()
  mm/numa_balancing:Allow migrate on protnone reference with
    MPOL_PREFERRED_MANY policy

 include/linux/mempolicy.h |  5 +++--
 mm/huge_memory.c          |  2 +-
 mm/internal.h             |  2 +-
 mm/memory.c               |  8 +++++---
 mm/mempolicy.c            | 36 +++++++++++++++++++++++++++---------
 5 files changed, 37 insertions(+), 16 deletions(-)

-- 
2.39.3
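
Steps 7 and 9 of the test procedure read pgpromote_success per node. The
cover letter does not say how the values were collected. One possible
way, sketched below, assumes the counter is exposed in the per-node
/sys/devices/system/node/nodeN/vmstat files on a kernel built with NUMA
balancing. The helper names find_counter() and read_node_counter() are
hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" pairs (the layout assumed for the per-node vmstat
 * file) and store the value of `name` in *out.
 * Returns 0 on success, -1 if the counter is not present.
 */
int find_counter(FILE *fp, const char *name, unsigned long long *out)
{
	char key[64];
	unsigned long long val;

	while (fscanf(fp, "%63s %llu", key, &val) == 2) {
		if (strcmp(key, name) == 0) {
			*out = val;
			return 0;
		}
	}
	return -1;
}

/*
 * Hypothetical wrapper: read one counter for NUMA node `nid`.
 * The sysfs path is an assumption about where the counter is exposed.
 * Returns 0 on success, -1 if the file or the counter is missing.
 */
int read_node_counter(int nid, const char *name, unsigned long long *out)
{
	char path[64];
	FILE *fp;
	int ret;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/vmstat", nid);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	ret = find_counter(fp, name, out);
	fclose(fp);
	return ret;
}
```

Calling read_node_counter(0, "pgpromote_success", &v) before and after
the read phase and subtracting the two values gives the number of pages
promoted to that node during the run.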