From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail191.messagelabs.com (mail191.messagelabs.com [216.82.242.19])
	by kanga.kvack.org (Postfix) with ESMTP id B44036B01AE
	for ; Mon, 22 Mar 2010 14:21:01 -0400 (EDT)
Date: Mon, 22 Mar 2010 19:20:28 +0100
From: Johannes Weiner
Subject: Re: [PATCH 00 of 34] Transparent Hugepage support #14
Message-ID: <20100322182028.GA13114@cmpxchg.org>
References: <20100318234923.GV29874@random.random>
	<20100319144101.GB29874@random.random>
	<20100322163523.GA12407@cmpxchg.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
To: Christoph Lameter
Cc: Andrea Arcangeli, linux-mm@kvack.org, Marcelo Tosatti, Adam Litke,
	Avi Kivity, Izik Eidus, Hugh Dickins, Nick Piggin, Rik van Riel,
	Mel Gorman, Dave Hansen, Benjamin Herrenschmidt, Ingo Molnar,
	Mike Travis, KAMEZAWA Hiroyuki, Chris Wright, bpicco@redhat.com,
	KOSAKI Motohiro, Balbir Singh, Arnd Bergmann, "Michael S. Tsirkin",
	Peter Zijlstra
List-ID:

On Mon, Mar 22, 2010 at 11:46:01AM -0500, Christoph Lameter wrote:
> On Mon, 22 Mar 2010, Johannes Weiner wrote:
>
> > > entries while walking the page tables! Go incrementally use what
> > > is there.
> >
> > That only works if you merely read the tables. If the VMA gets broken
> > up in the middle of a huge page, you definitely have to map ptes again.
>
> Yes then follow the established system for remapping stuff.

This is not comparable. The existing remapping code does not have to deal
with structural changes in the page tables.

> > And as already said, allowing it to happen always-succeeding and
> > atomically allows to switch users step by step.
>
> It results in a volatility in the page table entries that requires new
> synchronization procedures.
> It also increases the difficulty in
> establishing a reliable state of the pages / page tables for
> operations since there is potentially on-the-fly atomic conversion
> wizardry going on.

I really fail to see how.

Right now, when you want a stable entry, you take the pte lock. This is
exactly the same for huge page entries.

Migration happens on the fly as well, you can just hide it better for cases
like mremap(), which is unaffected by the actual entries it copies.

It becomes only visible at sites affected by the entries themselves, which
are not that many, the fault handler e.g. It is much easier to keep that
stuff centralized.

But pmd splitting affects everyone sensitive to the page table structure
itself and that ends up being everyone, well, walking page tables.

> > That sure sounds more incremental to me than being required to do
> > non-trivial adjustments to all the places at once!
>
> You do not need to do this all at once. Again the huge page subsystem has
> been around for years and we have established mechanisms to move/remap.
> There nothing hindering us from implementing huge page -> regular page
> conversion using the known methods or also implementing explicit huge page
> support in more portions of the kernel.

I see the real complexity in actually dealing with dynamically changing page
table structure and none of the existing code does that.
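
The locking argument in this message ("when you want a stable entry, you
take the pte lock") can be illustrated with a small user-space sketch. This
is not kernel code and not the patch set's implementation. Every name in it
(toy_pmd, toy_split, toy_lookup), the four-entry table, and the spinlock
built on a C11 atomic_flag are assumptions made up for the illustration.
It only shows that a walker which inspects the entry while holding the lock
sees a consistent state whether or not a split has happened.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PTES_PER_PMD 4        /* toy value, real tables are far larger */
#define TOY_SMALL_PAGE   4096UL

/* Hypothetical upper-level slot: one huge mapping or a table of small ones. */
struct toy_pmd {
	atomic_flag lock;                     /* stands in for the page table lock */
	bool huge;                            /* true: 'base' maps the whole range */
	unsigned long base;                   /* start of the mapped range */
	unsigned long pte[TOY_PTES_PER_PMD];  /* valid only when !huge */
};

static void toy_lock(struct toy_pmd *pmd)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&pmd->lock, memory_order_acquire))
		;
}

static void toy_unlock(struct toy_pmd *pmd)
{
	atomic_flag_clear_explicit(&pmd->lock, memory_order_release);
}

static void toy_pmd_init(struct toy_pmd *pmd, unsigned long base)
{
	atomic_flag_clear(&pmd->lock);
	pmd->huge = true;
	pmd->base = base;
}

/* Structural change: replace the huge entry with per-page entries. */
static void toy_split(struct toy_pmd *pmd)
{
	toy_lock(pmd);
	if (pmd->huge) {
		for (size_t i = 0; i < TOY_PTES_PER_PMD; i++)
			pmd->pte[i] = pmd->base + i * TOY_SMALL_PAGE;
		pmd->huge = false;
	}
	toy_unlock(pmd);
}

/* Walker: the kind of entry is examined only while the lock is held. */
static unsigned long toy_lookup(struct toy_pmd *pmd, size_t idx)
{
	unsigned long addr;

	assert(idx < TOY_PTES_PER_PMD);
	toy_lock(pmd);
	if (pmd->huge)
		addr = pmd->base + idx * TOY_SMALL_PAGE;
	else
		addr = pmd->pte[idx];
	toy_unlock(pmd);
	return addr;
}
```

In this toy, a caller that only goes through toy_lookup() gets the same
address before and after toy_split(). Code that cached the "huge" flag
outside the lock would not have that guarantee, which is the kind of site
the discussion above is about.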