From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f197.google.com (mail-pf0-f197.google.com [209.85.192.197]) by kanga.kvack.org (Postfix) with ESMTP id 6AC5B82F64 for ; Tue, 30 Aug 2016 00:44:46 -0400 (EDT)
Received: by mail-pf0-f197.google.com with SMTP id w128so21082592pfd.3 for ; Mon, 29 Aug 2016 21:44:46 -0700 (PDT)
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com. [148.163.156.1]) by mx.google.com with ESMTPS id qy6si43115999pab.154.2016.08.29.21.44.43 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 29 Aug 2016 21:44:43 -0700 (PDT)
Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.11/8.16.0.11) with SMTP id u7U4hvwr078351 for ; Tue, 30 Aug 2016 00:44:43 -0400
Received: from e23smtp01.au.ibm.com (e23smtp01.au.ibm.com [202.81.31.143]) by mx0a-001b2d01.pphosted.com with ESMTP id 2553620ddn-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Tue, 30 Aug 2016 00:44:42 -0400
Received: from localhost by e23smtp01.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 30 Aug 2016 14:44:40 +1000
Received: from d23relay08.au.ibm.com (d23relay08.au.ibm.com [9.185.71.33]) by d23dlp01.au.ibm.com (Postfix) with ESMTP id 9F7C32CE8056 for ; Tue, 30 Aug 2016 14:44:38 +1000 (EST)
Received: from d23av06.au.ibm.com (d23av06.au.ibm.com [9.190.235.151]) by d23relay08.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u7U4icVp3473714 for ; Tue, 30 Aug 2016 14:44:38 +1000
Received: from d23av06.au.ibm.com (localhost [127.0.0.1]) by d23av06.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u7U4icGC030008 for ; Tue, 30 Aug 2016 14:44:38 +1000
Date: Tue, 30 Aug 2016 10:14:25 +0530
From: Anshuman Khandual
MIME-Version: 1.0
Subject: Re: [PATCH] thp: reduce usage of huge zero page's atomic counter
References: <20160829155021.2a85910c3d6b16a7f75ffccd@linux-foundation.org> <36b76a95-5025-ac64-0862-b98b2ebdeaf7@intel.com> <20160829203916.6a2b45845e8fb0c356cac17d@linux-foundation.org>
In-Reply-To: <20160829203916.6a2b45845e8fb0c356cac17d@linux-foundation.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Message-Id: <57C50F29.4070309@linux.vnet.ibm.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton , Aaron Lu
Cc: Linux Memory Management List , "'Kirill A. Shutemov'" , Dave Hansen , Tim Chen , Huang Ying , Vlastimil Babka , Jerome Marchand , Andrea Arcangeli , Mel Gorman , Ebru Akagunduz , linux-kernel@vger.kernel.org

On 08/30/2016 09:09 AM, Andrew Morton wrote:
> On Tue, 30 Aug 2016 11:09:15 +0800 Aaron Lu wrote:
>
>>>> Case used for test on Haswell EP:
>>>> usemem -n 72 --readonly -j 0x200000 100G
>>>> Which spawns 72 processes and each will mmap 100G anonymous space and
>>>> then do read only access to that space sequentially with a step of 2MB.
>>>>
>>>> perf report for base commit:
>>>> 54.03% usemem [kernel.kallsyms] [k] get_huge_zero_page
>>>> perf report for this commit:
>>>> 0.11% usemem [kernel.kallsyms] [k] mm_get_huge_zero_page
>>>
>>> Does this mean that overall usemem runtime halved?
>>
>> Sorry for the confusion, the above line is extracted from perf report.
>> It shows the percent of CPU cycles executed in a specific function.
>>
>> The above two perf lines are used to show get_huge_zero_page doesn't
>> consume that much CPU cycles after applying the patch.
>>
>>>
>>> Do we have any numbers for something which is more real-wordly?
>>
>> Unfortunately, no real world numbers.
>>
>> We think the global atomic counter could be an issue for performance
>> so I'm trying to solve the problem.
>
> So, umm, we don't actually know if the patch is useful to anyone?

On a POWER system it reduces the CPU consumption of the above-mentioned
function a little bit. I don't think it's going to improve the actual
throughput of the workload substantially.

0.07% usemem [kernel.vmlinux] [k] mm_get_huge_zero_page

to

0.01% usemem [kernel.vmlinux] [k] mm_get_huge_zero_page

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/ .
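
As a rough illustration of why a drop from 0.07% to 0.01% in mm_get_huge_zero_page is unlikely to move throughput, the sketch below estimates the total-cycle saving implied by the two perf shares reported for POWER. It assumes the cycles spent outside that function are identical in both runs, which perf percentages alone do not establish, and it says nothing about wall-clock time. relative_cycles is a hypothetical helper name used only for this example.

```python
def relative_cycles(share_before: float, share_after: float) -> float:
    """Return (total cycles after) / (total cycles before).

    Both arguments are the function's share of all sampled cycles,
    expressed as fractions (0.0007 for 0.07%).

    Assumption (not from the thread): cycles spent outside the function
    are the same in both runs, so
        (1 - share_before) * total_before == (1 - share_after) * total_after
    """
    if not (0.0 <= share_before < 1.0 and 0.0 <= share_after < 1.0):
        raise ValueError("shares must be fractions in [0, 1)")
    return (1.0 - share_before) / (1.0 - share_after)


# POWER figures quoted in the reply: 0.07% before the patch, 0.01% after.
ratio = relative_cycles(0.0007, 0.0001)
saving_percent = (1.0 - ratio) * 100.0
print(f"estimated total-cycle saving: {saving_percent:.3f}%")
```

Under that assumption the estimated saving is about 0.06% of total cycles, in line with the expectation that throughput will not change substantially on this system.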