From: "Kirill A. Shutemov"
Date: Fri, 14 Nov 2014 02:58:33 +0200
Subject: Re: [PATCH 3/5] lib: lockless generic and arch independent page table (gpt) v2.
Message-ID: <20141114005833.GA1572@node.dhcp.inet.fi>
References: <1415644096-3513-1-git-send-email-j.glisse@gmail.com>
 <1415644096-3513-4-git-send-email-j.glisse@gmail.com>
 <20141110205814.GA4186@gmail.com>
 <20141110225036.GB4186@gmail.com>
To: Linus Torvalds
Cc: Jerome Glisse, Andrew Morton, Linux Kernel Mailing List, linux-mm,
 Joerg Roedel, Mel Gorman, "H. Peter Anvin", Peter Zijlstra,
 Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
 Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
 Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
 Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel, Liran Liss,
 Roland Dreier, Ben Sander, Greg Stoner, John Bridgman, Michael Mantor,
 Paul Blinzer, Laurent Morichetti, Alexander Deucher, Oded Gabbay,
 Jérôme Glisse

On Thu, Nov 13, 2014 at 03:50:02PM -0800, Linus Torvalds wrote:
> +/*
> + * The 'tree_level' data only describes one particular level
> + * of the tree. The upper levels are totally invisible to the
> + * user of the tree walker, since the tree walker will walk
> + * those using the tree definitions.
> + *
> + * NOTE! "struct tree_entry" is an opaque type, and is just a
> + * used as a pointer to the particular level. You can figure
> + * out which level you are at by looking at the "tree_level",
> + * but even better is to just use different "lookup()"
> + * functions for different levels, at which point the
> + * function is inherent to the level.

Please, don't. We will end up with the same last-level centric code as we
have now in the mm subsystem: all code only cares about pte. It makes
implementing variable page size support really hard and leads to a
copy-paste approach. And to the hugetlb parallel world...

It would be nice to have a tree_level description generic enough to get
rid of pte_present()/pte_dirty()/pte_* and implement generic helpers
instead.

Apart from the variable page size problem, we could one day support
different CPU page table formats at runtime: PAE/non-PAE on 32-bit
x86 or LPAE/non-LPAE on ARM in one binary kernel image.

The big topic is how to get it done without significant runtime cost :-/

--
 Kirill A. Shutemov
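
The "generic helpers" idea in the reply can be made concrete with a rough sketch. Everything below is hypothetical: struct tree_level_desc, struct pt_format, entry_present(), entry_dirty(), entry_leaf() and entry_map_size() are illustration names and are not taken from the gpt patch or from the kernel. The bit positions follow the x86 convention (present = bit 0, dirty = bit 6, page size = bit 7) purely as an example. The point is that one helper works at every level, so a large mapping at an upper level needs no separate code path, and the format is chosen through a pointer at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-level description; none of these names are from the patch. */
struct tree_level_desc {
	unsigned int shift;    /* log2 of the bytes covered by one entry */
	uint64_t present_mask; /* bit(s) marking the entry as valid */
	uint64_t dirty_mask;   /* bit(s) marking the mapping as dirty */
	uint64_t leaf_mask;    /* bit(s) marking a mapping rather than a table */
	bool always_leaf;      /* last level: every present entry is a mapping */
};

/* Hypothetical page table format: one description per level. */
struct pt_format {
	unsigned int nr_levels;
	const struct tree_level_desc *levels; /* levels[0] is the last level */
};

/* Level-independent replacements for pte_present()/pte_dirty()-style helpers. */
static inline bool entry_present(const struct tree_level_desc *l, uint64_t e)
{
	return (e & l->present_mask) != 0;
}

static inline bool entry_dirty(const struct tree_level_desc *l, uint64_t e)
{
	return (e & l->dirty_mask) != 0;
}

static inline bool entry_leaf(const struct tree_level_desc *l, uint64_t e)
{
	return l->always_leaf || (e & l->leaf_mask) != 0;
}

/* Bytes mapped directly by entry e at this level, or 0 if it maps nothing. */
static inline uint64_t entry_map_size(const struct pt_format *fmt,
				      unsigned int level, uint64_t e)
{
	const struct tree_level_desc *l;

	assert(level < fmt->nr_levels);
	l = &fmt->levels[level];
	if (!entry_present(l, e) || !entry_leaf(l, e))
		return 0;
	return UINT64_C(1) << l->shift;
}

/* Illustrative x86-style bit positions, used here only as an example. */
#define EX_PRESENT (UINT64_C(1) << 0)
#define EX_DIRTY   (UINT64_C(1) << 6)
#define EX_PSE     (UINT64_C(1) << 7)

/* Four-level layout: 4K pages, optional 2M and 1G mappings. */
static const struct tree_level_desc ex_levels_4[] = {
	{ 12, EX_PRESENT, EX_DIRTY, 0,      true  },
	{ 21, EX_PRESENT, EX_DIRTY, EX_PSE, false },
	{ 30, EX_PRESENT, EX_DIRTY, EX_PSE, false },
	{ 39, EX_PRESENT, 0,        0,      false },
};

/* Two-level layout: 4K pages, optional 4M mappings. */
static const struct tree_level_desc ex_levels_2[] = {
	{ 12, EX_PRESENT, EX_DIRTY, 0,      true  },
	{ 22, EX_PRESENT, EX_DIRTY, EX_PSE, false },
};

/* A walker would receive one of these at runtime instead of compile-time macros. */
static const struct pt_format ex_fmt_4 = {
	sizeof(ex_levels_4) / sizeof(ex_levels_4[0]), ex_levels_4
};
static const struct pt_format ex_fmt_2 = {
	sizeof(ex_levels_2) / sizeof(ex_levels_2[0]), ex_levels_2
};
```

With these tables, entry_map_size(&ex_fmt_4, 1, e) reports a 2M mapping when the page-size bit is set and 0 when the entry only points at a lower table. The extra load and mask per check is exactly the kind of runtime cost the last paragraph of the reply worries about; a real design would need some way to fold the descriptor into constants when only one format is built in.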