public inbox for linux-arch@vger.kernel.org
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>, Andi Kleen <ak@suse.de>,
	Paul Mackerras <paulus@samba.org>, Andrew Morton <akpm@osdl.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Randy.Dunlap" <rdunlap@xenotime.net>,
	"Protasevich, Natalie" <Natalie.Protasevich@unisys.com>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	Andy Whitcroft <apw@shadowen.org>
Subject: Re: [PATCH] x86_64: Make NR_IRQS configurable in Kconfig
Date: Thu, 10 Aug 2006 14:55:38 +0200
Message-ID: <1155214538.14749.54.camel@localhost>
In-Reply-To: <1155147948.19249.171.camel@localhost.localdomain>

On Wed, 2006-08-09 at 11:25 -0700, Dave Hansen wrote: 
> Instead of:
> 
> #define pfn_to_section_nr(pfn) ((pfn) >> PFN_SECTION_SHIFT)
> 
> We could do:
> 
> static inline unsigned long pfn_to_section_nr(unsigned long pfn)
> {
> 	return some_hash(pfn) % NR_OF_SECTION_SLOTS;
> }
> 
> This would, of course, still have limits on how _many_ sections can be
> populated.  But, it would remove the relationship on what the actual
> physical address ranges can be from the number of populated sections.
> 
> Of course, it isn't quite that simple.  You need to make sure that the
> sparse code is clean from all connections between section number and
> physical address, as well as handling things like hash collisions.  We'd
> probably also need to store the _actual_ physical address somewhere
> because we can't get it from the section number any more.

You have to deal with the hash collisions somehow, for example with a
list of pages that have the same hash. And you have to calculate the
hash value. Both hurt performance.
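
To make the two costs concrete, here is a minimal sketch of a hashed
lookup with collision chains. It is only an illustration, not code from
the sparse implementation: struct section_entry, section_slots,
section_insert() and pfn_to_section() are hypothetical names, some_hash()
is a stand-in multiplicative hash, and the values of PFN_SECTION_SHIFT
and NR_OF_SECTION_SLOTS are assumptions. It also assumes the hash is
taken over the section number (pfn >> PFN_SECTION_SHIFT), so that all
pfns of one section land in the same slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PFN_SECTION_SHIFT   8     /* assumed: 256 pages per section */
#define NR_OF_SECTION_SLOTS 1024  /* assumed: number of hash slots */

/* Hypothetical per-section record; the real section number has to be
 * stored because it cannot be derived from the slot index any more. */
struct section_entry {
	uint64_t section_nr;          /* pfn >> PFN_SECTION_SHIFT */
	void *section_mem_map;        /* stand-in for the section's mem_map */
	struct section_entry *next;   /* collision chain */
};

static struct section_entry *section_slots[NR_OF_SECTION_SLOTS];

/* Stand-in for some_hash(): a simple multiplicative hash. */
static uint64_t some_hash(uint64_t key)
{
	return (key * 0x9E3779B97F4A7C15ULL) >> 32;
}

/* Put a populated section at the head of its slot's chain. */
static void section_insert(struct section_entry *e)
{
	uint64_t slot;

	assert(e != NULL);
	slot = some_hash(e->section_nr) % NR_OF_SECTION_SLOTS;
	e->next = section_slots[slot];
	section_slots[slot] = e;
}

/* Cost 1: compute the hash. Cost 2: walk the collision chain. */
static struct section_entry *pfn_to_section(uint64_t pfn)
{
	uint64_t nr = pfn >> PFN_SECTION_SHIFT;
	struct section_entry *e;

	e = section_slots[some_hash(nr) % NR_OF_SECTION_SLOTS];
	while (e && e->section_nr != nr)
		e = e->next;
	return e;   /* NULL if the section is not populated */
}
```

Every lookup pays for the multiplication and modulo, and then for at
least one compare plus a pointer chase per colliding entry.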

> P.S. With sparsemem extreme, I think you can cover an entire 64-bits of
> address space with a 4GB top-level table.  If one more level of tables
> was added, we'd be down to (I think) an 8MB table.  So, that might be an
> option, too.

On s390 we have to prepare for an address space that has one chunk of
memory at the low end and another chunk with address bit 2^63 set. So
the mem_map array needs to cover the whole 64-bit address range. For
sparsemem, we can choose the size of the mem_map sections and how many
indirections the lookup table should have. Some examples:

1) flat mem_map array: 2^52 entries, 56 bytes each.
2) mem_map sections with 256 entries / 14KB for each section,
   1 indirection level, 2^44 indirection pointers, 128TB overhead
3) mem_map sections with 256 entries / 14KB for each section,
   2 indirection levels, 2^22 indirection pointers for each level,
   32MB for each indirection array, minimum 64MB overhead
4) mem_map sections with 256 entries / 14KB for each section,
   3 indirection levels, 2^15/2^15/2^14 indirection pointers,
   256KB/256KB/128KB indirection arrays, minimum 640KB overhead
5) mem_map sections with 1024 entries / 56KB for each section,
   3 indirection levels, 2^14/2^14/2^14 indirection pointers,
   128KB/128KB/128KB indirection arrays, minimum 384KB overhead
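
The numbers in this list can be rechecked with a few helpers. This is
only a sketch of the arithmetic: the helper names are made up for
illustration, and it assumes 4KB pages (which is what 2^52 entries for a
64-bit address space implies) and 8-byte indirection pointers.

```c
#include <assert.h>
#include <stdint.h>

#define ADDRESS_BITS     64
#define PAGE_SHIFT       12   /* assumed: 4KB pages */
#define STRUCT_PAGE_SIZE 56   /* bytes per mem_map entry, as above */
#define POINTER_SIZE     8    /* assumed: 64-bit pointers */

static_assert(ADDRESS_BITS - PAGE_SHIFT == 52, "2^52 page frames");

/* Bits needed to index all sections of 2^entries_shift pages each. */
static unsigned int section_index_bits(unsigned int entries_shift)
{
	return ADDRESS_BITS - PAGE_SHIFT - entries_shift;
}

/* Size in bytes of one mem_map section. */
static uint64_t section_bytes(unsigned int entries_shift)
{
	return ((uint64_t)1 << entries_shift) * STRUCT_PAGE_SIZE;
}

/* Size in bytes of one indirection array with 2^bits pointers. */
static uint64_t index_array_bytes(unsigned int bits)
{
	return ((uint64_t)1 << bits) * POINTER_SIZE;
}

/* Minimum overhead: one indirection array allocated per level. */
static uint64_t min_overhead(const unsigned int *bits, unsigned int levels)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < levels; i++)
		total += index_array_bytes(bits[i]);
	return total;
}
```

For example, section_index_bits(8) gives the 44 index bits that examples
2) to 4) split across 1, 2 and 3 levels, and min_overhead() over
{15, 15, 14} adds up the 256KB + 256KB + 128KB of example 4).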

Two levels of indirection result in a large memory overhead. With three
levels of indirection the memory overhead is ok, but each lookup has to
walk three indirections. This adds cpu cycles to every access to the
mem_map array.
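
Roughly, a lookup for example 4) would look like the sketch below. The
names (top_table, the L1/L2/L3 split macros) and the placeholder struct
page are assumptions for illustration, not the existing sparsemem code.
The point is the three dependent loads before the mem_map entry itself
can be touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTION_SHIFT 8    /* 256 mem_map entries per section */
#define L1_BITS       15   /* top level */
#define L2_BITS       15
#define L3_BITS       14   /* lowest level, points to sections */

static_assert(L1_BITS + L2_BITS + L3_BITS + SECTION_SHIFT == 52,
	      "levels must cover all 2^52 page frames");

/* Placeholder with the 56 byte size used in the examples. */
struct page { unsigned char pad[56]; };

/* Top-level array: 2^15 pointers, 256KB. Lower levels are allocated
 * only where memory is populated; NULL means "not present". */
static struct page ***top_table[1u << L1_BITS];

static struct page *pfn_to_page(uint64_t pfn)
{
	uint64_t nr = pfn >> SECTION_SHIFT;
	uint64_t i3 = nr & (((uint64_t)1 << L3_BITS) - 1);
	uint64_t i2 = (nr >> L3_BITS) & (((uint64_t)1 << L2_BITS) - 1);
	uint64_t i1 = nr >> (L3_BITS + L2_BITS);
	struct page ***l2;
	struct page **l3;
	struct page *section;

	if (i1 >= ((uint64_t)1 << L1_BITS))
		return NULL;
	l2 = top_table[i1];     /* indirection 1 */
	if (!l2)
		return NULL;
	l3 = l2[i2];            /* indirection 2 */
	if (!l3)
		return NULL;
	section = l3[i3];       /* indirection 3 */
	if (!section)
		return NULL;
	return &section[pfn & (((uint64_t)1 << SECTION_SHIFT) - 1)];
}
```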

The alternative of a flat mem_map array in vmalloc space is much more
attractive. The size of the array is 2^52 * 56 bytes, which is 7/512 or
about 1.4% of the 64-bit virtual address space. The access doesn't
change: it is still a plain array access, and it gets cached
automatically by the hardware. Simple, straightforward, no additional
overhead. Only the setup of the kernel page tables for the mem_map
vmalloc area needs some thought.
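
As a sketch of what that means for the lookup, assuming a hypothetical
base pointer vmem_map for the virtually mapped array and the same 56
byte placeholder struct page (neither is an existing kernel symbol):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PFN_BITS 52   /* 64-bit addresses, 4KB pages assumed */

/* Placeholder with the 56 byte size used above. */
struct page { unsigned char pad[56]; };
static_assert(sizeof(struct page) == 56, "placeholder must be 56 bytes");

/* Hypothetical start of the flat mem_map in vmalloc space. Only the
 * parts that describe real memory would be backed by page tables. */
static struct page *vmem_map;

/* The lookup is plain pointer arithmetic, no table walk. */
static struct page *pfn_to_page(uint64_t pfn)
{
	return vmem_map + pfn;
}

static uint64_t page_to_pfn(const struct page *page)
{
	return (uint64_t)(page - vmem_map);
}

/* Virtual size of the whole array: 2^52 * 56 = 7 * 2^55 bytes. */
static uint64_t vmem_map_bytes(void)
{
	return ((uint64_t)1 << MAX_PFN_BITS) * sizeof(struct page);
}

/* Share of the 2^64 byte address space taken by the array (7/512). */
static double vmem_map_share(void)
{
	return (double)vmem_map_bytes() / 18446744073709551616.0;
}
```

The cost moves from the lookup path to the setup: someone has to create
the kernel page table entries for the populated parts of the array.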

-- 
blue skies,
  Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.




Thread overview: 7+ messages
     [not found] <m1irl4ftya.fsf@ebiederm.dsl.xmission.com>
     [not found] ` <20060807194159.f7c741b5.akpm@osdl.org>
     [not found]   ` <17624.7310.856480.704542@cargo.ozlabs.ibm.com>
2006-08-08  5:14     ` [PATCH] x86_64: Make NR_IRQS configurable in Kconfig Andi Kleen
2006-08-08  8:17       ` Martin Schwidefsky
2006-08-09 17:58         ` Luck, Tony
2006-08-09 18:25           ` Dave Hansen
2006-08-10 12:55             ` Martin Schwidefsky [this message]
2006-08-10 14:40               ` Andy Whitcroft
2006-08-10 14:53                 ` Martin Schwidefsky
