From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752015AbcH3Eos (ORCPT <rfc822;w@1wt.eu>);
        Tue, 30 Aug 2016 00:44:48 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:35009 "EHLO
        mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751636AbcH3Eoq (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 30 Aug 2016 00:44:46 -0400
X-IBM-Helo: d23dlp02.au.ibm.com
X-IBM-MailFrom: khandual@linux.vnet.ibm.com
X-IBM-RcptTo: linux-kernel@vger.kernel.org
Date: Tue, 30 Aug 2016 10:14:25 +0530
From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Andrew Morton <akpm@linux-foundation.org>,
        Aaron Lu <aaron.lu@intel.com>
CC: Linux Memory Management List <linux-mm@kvack.org>,
        "'Kirill A. Shutemov'" <kirill.shutemov@linux.intel.com>,
        Dave Hansen <dave.hansen@intel.com>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Huang Ying <ying.huang@intel.com>, Vlastimil Babka <vbabka@suse.cz>,
        Jerome Marchand <jmarchan@redhat.com>,
        Andrea Arcangeli <aarcange@redhat.com>,
        Mel Gorman <mgorman@techsingularity.net>,
        Ebru Akagunduz <ebru.akagunduz@gmail.com>,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH] thp: reduce usage of huge zero page's atomic counter
References: <b7e47f2c-8aac-156a-f627-a50db31220f8@intel.com>        <20160829155021.2a85910c3d6b16a7f75ffccd@linux-foundation.org>        <36b76a95-5025-ac64-0862-b98b2ebdeaf7@intel.com> <20160829203916.6a2b45845e8fb0c356cac17d@linux-foundation.org>
In-Reply-To: <20160829203916.6a2b45845e8fb0c356cac17d@linux-foundation.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16083004-0012-0000-0000-000001BED390
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 16083004-0013-0000-0000-000005E31D09
Message-Id: <57C50F29.4070309@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-08-30_02:,,
 signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0
 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam
 adjust=0 reason=mlx scancount=1 engine=8.0.1-1604210000
 definitions=main-1608300044
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/30/2016 09:09 AM, Andrew Morton wrote:
> On Tue, 30 Aug 2016 11:09:15 +0800 Aaron Lu <aaron.lu@intel.com> wrote:
> 
>>>> Case used for test on Haswell EP:
>>>> usemem -n 72 --readonly -j 0x200000 100G
>>>> Which spawns 72 processes and each will mmap 100G anonymous space and
>>>> then do read only access to that space sequentially with a step of 2MB.
>>>>
>>>> perf report for base commit:
>>>>     54.03%  usemem   [kernel.kallsyms]   [k] get_huge_zero_page
>>>> perf report for this commit:
>>>>      0.11%  usemem   [kernel.kallsyms]   [k] mm_get_huge_zero_page
>>>
>>> Does this mean that overall usemem runtime halved?
>>
>> Sorry for the confusion, the above line is extracted from perf report.
>> It shows the percent of CPU cycles executed in a specific function.
>>
>> The above two perf lines are used to show get_huge_zero_page doesn't
>> consume that much CPU cycles after applying the patch.
>>
>>>
>>> Do we have any numbers for something which is more real-wordly?
>>
>> Unfortunately, no real world numbers.
>>
>> We think the global atomic counter could be an issue for performance
>> so I'm trying to solve the problem.
> 
> So, umm, we don't actually know if the patch is useful to anyone?

On a POWER system it reduces the CPU cycles spent in the above-mentioned
function a little bit. I don't think it's going to improve the actual
throughput of the workload substantially.

0.07%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page

to

0.01%  usemem  [kernel.vmlinux]  [k] mm_get_huge_zero_page