From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751404AbbCIDR3 (ORCPT ); Sun, 8 Mar 2015 23:17:29 -0400
Received: from cantor2.suse.de ([195.135.220.15]:43455 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751111AbbCIDR1 (ORCPT ); Sun, 8 Mar 2015 23:17:27 -0400
Message-ID: <54FD10BF.3010709@suse.cz>
Date: Mon, 09 Mar 2015 04:17:19 +0100
From: Vlastimil Babka
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: Davidlohr Bueso
CC: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, Hugh Dickins, Andrea Arcangeli, "Kirill A. Shutemov", Rik van Riel, Mel Gorman, Michal Hocko, Ebru Akagunduz, Alex Thorlton, David Rientjes, Peter Zijlstra, Ingo Molnar
Subject: Re: [RFC 0/6] the big khugepaged redesign
References: <1424696322-21952-1-git-send-email-vbabka@suse.cz> <1424731603.6539.51.camel@stgolabs.net>
In-Reply-To: <1424731603.6539.51.camel@stgolabs.net>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/23/2015 11:46 PM, Davidlohr Bueso wrote:
> On Mon, 2015-02-23 at 13:58 +0100, Vlastimil Babka wrote:
>> Recently, there was concern expressed (e.g. [1]) whether the quite aggressive
>> THP allocation attempts on page faults are a good performance trade-off.
>>
>> - THP allocations add to page fault latency, as high-order allocations are
>> notoriously expensive. The page allocation slowpath now does extra checks for
>> GFP_TRANSHUGE && !PF_KTHREAD to avoid the more expensive synchronous
>> compaction for user page faults. But even async compaction can be expensive.
>> - During the first page fault in a 2MB range we cannot predict how much of the
>> range will actually be accessed - we can theoretically waste as much as 511
>> pages' worth of memory [2]. Or, the pages in the range might be accessed from
>> CPUs on different NUMA nodes, and while base pages could all be local, the THP
>> could be remote to all but one CPU. The cost of remote accesses due to this
>> false sharing would be higher than any savings on the TLB.
>> - The interaction with memcg is also problematic [1].
>>
>> Now I don't have any hard data to show how big these problems are, and I
>> expect we will discuss this at LSF/MM (and hope somebody has such data [3]).
>> But it's certain that e.g. SAP recommends disabling THPs [4] for their apps
>> for performance reasons.
>
> There are plenty of examples of this, e.g. for Oracle:
>
> https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge
> http://oracle-base.com/articles/linux/configuring-huge-pages-for-oracle-on-linux-64.php

Just stumbled upon more references when catching up on lwn:
http://lwn.net/Articles/634797/
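For readers following the thread: the knob those vendor tuning notes point at is the sysfs THP policy file. A minimal sketch of inspecting and disabling THP system-wide (paths as exposed by mainline kernels of this era; writing requires root, and the `defrag` file may offer additional modes on some kernels):

```shell
# Show the current system-wide THP policy; the active mode is the
# bracketed word, e.g. "always [madvise] never"
cat /sys/kernel/mm/transparent_hugepage/enabled

# Disable THP allocations at page fault time entirely, as the
# SAP/Oracle notes above recommend (needs root)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

Applications that want THP only for specific mappings can instead leave the policy at "madvise" and opt in per-region with madvise(MADV_HUGEPAGE).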