From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0125CA9ECE for ; Fri, 1 Nov 2019 07:58:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BFA4B2080F for ; Fri, 1 Nov 2019 07:58:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BFA4B2080F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1F3166B0008; Fri, 1 Nov 2019 03:58:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 17B016B000A; Fri, 1 Nov 2019 03:58:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EE28F6B000C; Fri, 1 Nov 2019 03:58:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0036.hostedemail.com [216.40.44.36]) by kanga.kvack.org (Postfix) with ESMTP id C58426B0008 for ; Fri, 1 Nov 2019 03:58:29 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 71A532A87 for ; Fri, 1 Nov 2019 07:58:29 +0000 (UTC) X-FDA: 76106956338.17.bulb08_fd5564184437 X-HE-Tag: bulb08_fd5564184437 X-Filterd-Recvd-Size: 3710 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf37.hostedemail.com (Postfix) with ESMTP for ; Fri, 1 Nov 2019 07:58:28 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2019 00:58:27 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,254,1569308400"; d="scan'208";a="225962470" Received: from yhuang-dev.sh.intel.com ([10.239.159.29]) by fmsmga004.fm.intel.com with ESMTP; 01 Nov 2019 00:58:24 -0700 From: "Huang, Ying" To: Peter Zijlstra Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying , Andrew Morton , Michal Hocko , Rik van Riel , Mel Gorman , Ingo Molnar , Dave Hansen , Dan Williams , Fengguang Wu Subject: [RFC 02/10] autonuma: Reduce cache footprint when scanning page tables Date: Fri, 1 Nov 2019 15:57:19 +0800 Message-Id: <20191101075727.26683-3-ying.huang@intel.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20191101075727.26683-1-ying.huang@intel.com> References: <20191101075727.26683-1-ying.huang@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Huang Ying In auto NUMA balancing page table scanning, if the pte_protnone() is true, the PTE needs not to be changed because it's in target state already. So other checking on corresponding struct page is unnecessary too. So, if we check pte_protnone() firstly for each PTE, we can avoid unnecessary struct page accessing, so that reduce the cache footprint of NUMA balancing page table scanning. In the performance test of pmbench memory accessing benchmark with 80:20 read/write ratio and normal access address distribution on a 2 socket Intel server with Optance DC Persistent Memory, perf profiling shows that the autonuma page table scanning time reduces from 1.23% to 0.97% (that is, reduced 21%) with the patch. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dave Hansen Cc: Dan Williams Cc: Fengguang Wu Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org --- mm/mprotect.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/mprotect.c b/mm/mprotect.c index bf38dfbbb4b4..d69b9913388e 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -80,6 +80,10 @@ static unsigned long change_pte_range(struct vm_area_s= truct *vma, pmd_t *pmd, if (prot_numa) { struct page *page; =20 + /* Avoid TLB flush if possible */ + if (pte_protnone(oldpte)) + continue; + page =3D vm_normal_page(vma, addr, oldpte); if (!page || PageKsm(page)) continue; @@ -97,10 +101,6 @@ static unsigned long change_pte_range(struct vm_area_= struct *vma, pmd_t *pmd, if (page_is_file_cache(page) && PageDirty(page)) continue; =20 - /* Avoid TLB flush if possible */ - if (pte_protnone(oldpte)) - continue; - /* * Don't mess with PTEs if page is already on the node * a single-threaded process is running on. --=20 2.23.0