From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail190.messagelabs.com (mail190.messagelabs.com [216.82.249.51])
	by kanga.kvack.org (Postfix) with SMTP id 027A06003C1
	for ; Tue, 26 Jan 2010 10:48:10 -0500 (EST)
Date: Tue, 26 Jan 2010 09:47:51 -0600 (CST)
From: Christoph Lameter
Subject: Re: [PATCH 00 of 30] Transparent Hugepage support #3
In-Reply-To: <20100125224643.GA30452@random.random>
Message-ID:
References: <20100122151947.GA3690@random.random>
	<20100123175847.GC6494@random.random>
	<20100125224643.GA30452@random.random>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
To: Andrea Arcangeli
Cc: linux-mm@kvack.org, Marcelo Tosatti, Adam Litke, Avi Kivity,
	Izik Eidus, Hugh Dickins, Nick Piggin, Rik van Riel, Mel Gorman,
	Andi Kleen, Dave Hansen, Benjamin Herrenschmidt, Ingo Molnar,
	Mike Travis, KAMEZAWA Hiroyuki, Chris Wright, Andrew Morton
List-ID:

On Mon, 25 Jan 2010, Andrea Arcangeli wrote:

> On Mon, Jan 25, 2010 at 03:50:31PM -0600, Christoph Lameter wrote:
> > There is always VM activity, so we need 512 pointers sigh.
>
> well you said some week ago that actual systems never swap and swap is
> useless... if they don't swap there will be just 1 pointer in the
> pmd. The mprotect/mremap we want to learn using pmd_trans_huge
> natively without split but again, this is incremental work.

I have to disable swap to be able to make use of these huge pages?

> > So its not possible to use these "huge" pages in a useful way inside of
> > the kernel. They are volatile and temporary.
>
> They are so useless that firefox never splits them, this is my
> laptop. khugepaged running so if there's swapout, after swapin they
> will be collapsed back into hugepages.

Just because your configuration did not split does not mean that there is
a guarantee of them not splitting. You need to guarantee that the VM does
not split them in order to be able to safely refer to them from code (like
I/O paths).

> > In short they cannot be treated as 2M entities unless we add some logic to
> > prevent splitting.
>
> They can on the physical side, splitting only involves the virtual
> side, this is why O_DIRECT DMA through gup already works on hugepages
> without splitting them.

Earlier you stated that reclaim can remove 4k pieces of huge pages after a
split. How does gup keep the huge pages stable while doing I/O? Does gup
submit 512 pointers to 4k chunks or 1 pointer to a 2M chunk?

> Just send me patches to remove all callers of split_huge_page, then
> split_huge_page can go away too. But saying that hugepages aren't
> useful already is absurd, kvm with "madvise" default of sysfs already
> gets the full benefit, nothing more can be achieved by kvm in

This implementation seems to only address the TLB pressure issue but not
the scaling issue that arises because we have to handle data in 4k chunks
(512 4k pointers instead of one 2M pointer). Scaling is not addressed
because complex fallback logic sabotages a basic benefit of huge pages.

> performance and functionality than what my patch delivers already
> (ok swapping will be a little more efficient if done through 2M I/O
> but swap performance isn't so critical). Our objective is to over time
> eliminate the need of split_huge_page. khugepaged will remain required

Ok then establish some way to make these huge pages stable.

> forever, unless the whole kernel ram will become relocatable and
> defrag not just an heuristic but a guarantee (it is needed after one
> VM exits and release several gigs of hugepages, so the other VM get
> the speedup).

That all depends on what you mean by guarantee I guess.