From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner+w=401wt.eu-S932424AbXDAKrJ@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932424AbXDAKrJ (ORCPT <rfc822;w@1wt.eu>);
	Sun, 1 Apr 2007 06:47:09 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S932433AbXDAKrJ
	(ORCPT <rfc822;linux-kernel-outgoing>);
	Sun, 1 Apr 2007 06:47:09 -0400
Received: from mail.suse.de ([195.135.220.2]:59266 "EHLO mx1.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932424AbXDAKrH (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Sun, 1 Apr 2007 06:47:07 -0400
From: Andi Kleen <ak@suse.de>
Organization: SUSE Linux Products GmbH, Nuernberg, GF: Markus Rex, HRB 16746 (AG Nuernberg)
To: Christoph Lameter <clameter@sgi.com>
Subject: Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL
Date: Sun, 1 Apr 2007 12:46:51 +0200
User-Agent: KMail/1.9.5
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
       Martin Bligh <mbligh@google.com>, linux-mm@kvack.org,
       KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
References: <20070401071024.23757.4113.sendpatchset@schroedinger.engr.sgi.com> <20070401071029.23757.78021.sendpatchset@schroedinger.engr.sgi.com>
In-Reply-To: <20070401071029.23757.78021.sendpatchset@schroedinger.engr.sgi.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200704011246.52238.ak@suse.de>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Sunday 01 April 2007 09:10, Christoph Lameter wrote:
> x86_64 make SPARSE_VIRTUAL the default
> 
> x86_64 is using 2M page table entries to map its 1-1 kernel space.
> We implement the virtual memmap also using 2M page table entries.
> So there is no difference at all to FLATMEM. Both schemes require
> a page table and a TLB.

Hmm, does this mean there is at least 2MB worth of struct page on every
node? Or do the mappings overlap with other memory? (I think they do.)
In that case you have to handle the overlap in change_page_attr().
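
As a rough illustration of the sizes involved (a sketch, not code from
the patch): assuming 4KB base pages and a struct page of 56 or 64 bytes,
which depends on the kernel configuration, the hypothetical helpers
below compute how much memory a single 2MB chunk of memmap describes.

```c
#include <stdint.h>

/* Illustrative constants; none of these names come from the patch. */
#define EX_BASE_PAGE_SIZE 4096ULL        /* 4KB base page on x86_64 */
#define EX_HUGE_PAGE_SIZE (2ULL << 20)   /* one 2MB page table entry */

/* Number of struct page entries that fit into one 2MB memmap chunk. */
static uint64_t ex_entries_per_chunk(uint64_t struct_page_size)
{
	return EX_HUGE_PAGE_SIZE / struct_page_size;
}

/* Bytes of memory described by one 2MB memmap chunk. */
static uint64_t ex_memory_per_chunk(uint64_t struct_page_size)
{
	return ex_entries_per_chunk(struct_page_size) * EX_BASE_PAGE_SIZE;
}
```

With a 64-byte struct page, one 2MB chunk holds 32768 entries and
covers 128MB of memory, so a node with less memory than that cannot
fill a whole 2MB mapping on its own.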

Also your "generic" vmemmap code doesn't look very generic, but
rather x86 specific. I don't think huge pages can be set up this
easily on many other architectures.

And when you reserve virtual space somewhere, you should
update Documentation/x86_64/mm.txt. Also, you didn't adjust
the end of the vmalloc area, so in theory vmalloc could run
into your vmemmap.
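
The collision concern can be sketched as a plain interval check. This
is only an illustration: the struct and function names are
hypothetical, and no real layout addresses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* A virtual address range; both bounds are inclusive. */
struct ex_va_range {
	uint64_t start;
	uint64_t end;
};

/* True if the two ranges share at least one address. */
static bool ex_va_ranges_overlap(struct ex_va_range a, struct ex_va_range b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

If the vmalloc range and the vmemmap range overlap under this check,
the end of the vmalloc area has to move down or the vmemmap base has
to move up.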

> Thus the SPARSEMEM becomes the most efficient way of handling
> virt_to_page, pfn_to_page and friends for UP, SMP and NUMA.

Do you have any benchmark numbers to prove it? There seem to be a few
benchmarks where the discontig virt_to_page is a problem
(although I know ways to make it more efficient), and sparsemem
is normally slower. Still, some numbers would be good.

-Andi