public inbox for linux-ia64@vger.kernel.org
From: Jack Steiner <steiner@sgi.com>
To: linux-ia64@vger.kernel.org
Subject: Re: [patch 0/4] ia64 SPARSEMEM
Date: Thu, 26 May 2005 21:44:53 +0000
Message-ID: <20050526214453.GA20816@sgi.com>
In-Reply-To: <20050523175031.GC2783@localhost.localdomain>

On Thu, May 26, 2005 at 04:54:08PM -0400, Bob Picco wrote:
> luck wrote:	[Wed May 25 2005, 08:32:54PM EDT]
> > 
> > >+#ifdef CONFIG_SPARSEMEM
> > >+ /*
> > >+ * SECTION_SIZE_BITS            2^N: how big each section will be
> > >+ * MAX_PHYSADDR_BITS            2^N: how much physical address space we have
> > >+ * MAX_PHYSMEM_BITS             2^N: how much memory we can have in that space
> > >+ */
> > 
> > MAX_PHYSADDR_BITS is apparently never used ... what's the distinction
> Ah, MAX_PHYSADDR_BITS appears to be unused by all arches ported to SPARSEMEM.  I
> wonder if it's a remnant of NONLINEAR.  Dave, do you recall?
> > between it and MAX_PHYSMEM_BITS?  From the comments, I'd guess that you
> > really meant to use MAX_PHYSADDR_BITS in this:
> > 
> > #define SECTIONS_SHIFT          (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
> > 
> > Pursuing Jack Steiner's line of questioning on how this works for
> > the SGI Altix ... it would appear that he will need to use 50 for
> > MAX_PHYSMEM_BITS, and probably 32 for SECTION_SIZE_BITS (but maybe
> I went back and reviewed Jack's email.  I must be blind but don't see why
> he would need more than 44 bits of physical memory bits.  I agree that
> should you need 50 bits for physical address bits then you should use
> 32 bits for SECTION_SIZE_BITS.

Ahhhh. You folks are a step ahead of me. I was just in the process of trying
to figure out the various options.

We definitely need 50-bit physical addresses (49 on today's hardware, but more
coming).

A physical address on Altix looks like:

	+-------------+--+--------------------+
	|  NODE #     |AS|       NodeOffset   |
	+-------------+--+--------------------+
	 4           3 33 3                  0
	 9           8 76 5                  0
	
		Bits [48:38] contain a node number in the range 0..2047
			(another bit will be added soon)
		Bits [37:36] always contain a "3" for WB RAM.
		Bits [35:0]  contain the node offset

Node numbers are not dense & do not start at 0. Large systems can
be partitioned into smaller chunks. Node numbers within a partition
are typically not interleaved with the node numbers of other partitions, but
it is possible to have a partition with almost any subset of node numbers.
For example, a partition could consist of nodes 1536, 1538, & 1552.
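
A minimal sketch of decoding the layout described above, assuming exactly the
field boundaries listed (node in [48:38], AS in [37:36], offset in [35:0]).
The macro and function names are hypothetical, chosen for illustration, and
are not taken from the SN2 headers.

```c
#include <assert.h>
#include <stdint.h>

/* Field boundaries from the layout above; all names are hypothetical. */
#define PA_NODE_SHIFT   38
#define PA_NODE_BITS    11      /* bits [48:38], node numbers 0..2047 */
#define PA_AS_SHIFT     36
#define PA_AS_BITS      2       /* bits [37:36] */
#define PA_AS_WB_RAM    3       /* AS value for WB RAM */
#define PA_OFFSET_BITS  36      /* bits [35:0] */
#define PA_OFFSET_MASK  ((UINT64_C(1) << PA_OFFSET_BITS) - 1)

/* Extract the node number from a physical address. */
static inline unsigned int pa_to_node(uint64_t pa)
{
	return (unsigned int)((pa >> PA_NODE_SHIFT) & ((1u << PA_NODE_BITS) - 1));
}

/* Extract the AS field; 3 is expected for WB RAM. */
static inline unsigned int pa_to_as(uint64_t pa)
{
	return (unsigned int)((pa >> PA_AS_SHIFT) & ((1u << PA_AS_BITS) - 1));
}

/* Extract the offset within the node. */
static inline uint64_t pa_to_offset(uint64_t pa)
{
	return pa & PA_OFFSET_MASK;
}

/* Build a WB RAM physical address from a node number and node offset. */
static inline uint64_t make_pa(unsigned int node, uint64_t offset)
{
	assert(node < (1u << PA_NODE_BITS));
	assert(offset <= PA_OFFSET_MASK);
	return ((uint64_t)node << PA_NODE_SHIFT) |
	       ((uint64_t)PA_AS_WB_RAM << PA_AS_SHIFT) | offset;
}
```

With these definitions, the first byte of node 1536 would sit at
make_pa(1536, 0) = 0x1803000000000, which shows how sparse the physical
address space is even for a three-node partition.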


> > a smaller number ... his banks of memory all start on 4G boundaries,

All banks (currently) start on 16GB boundaries. I don't think it
matters, but directory memory occupies the last 1/32 of each DIMM. This
means that memory blocks are slightly smaller than you might expect. The
BIOS marks the directory memory as "unavailable".


> > but could be as small as 1G ... can you have a chunk with an empty
> > tail?).  So SGI will end up with 2^(50-32) = 256K entries in mem_section[]
> > (or perhaps 4x that if sections must be fully populated).  All allocated
> > on the boot node ... and perhaps consuming a significant portion of
> > the kernel memory mapped by dtr[0].
> Well, worst case it would consume 2^((50-32)+3) bytes (2 MB).  I would hope that
> it's not configured for 28 SECTION_SIZE_BITS and 50 physical. This would
> be excessive, 2^((50-28)+3) = 32 MB, and not advised.
> > 
> > 
> > It will be interesting to see performance numbers on how this compares
> > against VIRTUAL_MEM_MAP ... trading cache misses vs. TLB misses.
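
The mem_section[] sizing arithmetic quoted above can be checked with a short
sketch. It assumes 8 bytes (2^3) per mem_section[] entry, which is what the
"+3" in the quoted estimate implies; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per mem_section[] entry, as implied by the "+3" in the estimate. */
#define ENTRY_SHIFT 3

/* Number of sections: one per 2^section_bits bytes of physical space. */
static uint64_t nr_sections(unsigned int phys_bits, unsigned int section_bits)
{
	assert(phys_bits >= section_bits);
	assert(phys_bits - section_bits < 64 - ENTRY_SHIFT);
	return UINT64_C(1) << (phys_bits - section_bits);
}

/* Total size of the mem_section[] table in bytes. */
static uint64_t mem_section_bytes(unsigned int phys_bits,
				  unsigned int section_bits)
{
	return nr_sections(phys_bits, section_bits) << ENTRY_SHIFT;
}
```

For 50 physical bits, nr_sections(50, 32) gives 256K entries and
mem_section_bytes(50, 32) gives 2 MB, while mem_section_bytes(50, 28)
gives 32 MB.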

I just finished building & booting a SPARSEMEM kernel. No problems, but I have
not run any performance tests yet.

I had MAX_PHYSMEM_BITS set to the wrong value. I was on a small
system so it did not cause problems. I'll fix the size before running 
performance tests.

I noticed that available memory seems slightly smaller but have not tracked down the
cause.
 BASELINE
 Nid  MemTotal   MemFree   MemUsed      (in kB)
   0   3820304   3510272    310032
   1   3882992   3800224     82768
   2   3883008   3794352     88656
   3   3882992   3801552     81440
   4   3883008   3802272     80736

 SPARSE
 Nid  MemTotal   MemFree   MemUsed      (in kB)
   0   3820256   3320784    499472
   1   3882992   3741328    141664
   2   3883008   3749536    133472
   3   3882992   3751392    131600
   4   3883008   3758368    124640
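
A small sketch for comparing the two tables, using only the numbers shown
above. MemUsed is recomputed as MemTotal - MemFree; the struct and function
names are hypothetical. The two boots may differ in runtime state, so the
deltas are only a rough indication of the SPARSEMEM overhead.

```c
#include <assert.h>
#include <stddef.h>

struct node_mem {
	long total_kb;	/* MemTotal */
	long free_kb;	/* MemFree */
};

#define NR_NODES 5

/* Values copied from the BASELINE table. */
static const struct node_mem baseline[NR_NODES] = {
	{ 3820304, 3510272 }, { 3882992, 3800224 }, { 3883008, 3794352 },
	{ 3882992, 3801552 }, { 3883008, 3802272 },
};

/* Values copied from the SPARSE table. */
static const struct node_mem sparse[NR_NODES] = {
	{ 3820256, 3320784 }, { 3882992, 3741328 }, { 3883008, 3749536 },
	{ 3882992, 3751392 }, { 3883008, 3758368 },
};

static long used_kb(const struct node_mem *m)
{
	return m->total_kb - m->free_kb;
}

/* Extra memory in use on one node with SPARSEMEM, in kB. */
static long extra_used_kb(size_t nid)
{
	assert(nid < NR_NODES);
	return used_kb(&sparse[nid]) - used_kb(&baseline[nid]);
}

/* Extra memory in use summed over all nodes, in kB. */
static long total_extra_used_kb(void)
{
	long sum = 0;

	for (size_t nid = 0; nid < NR_NODES; nid++)
		sum += extra_used_kb(nid);
	return sum;
}
```

On these numbers MemTotal only changes on node 0 (48 kB smaller), while
MemUsed is higher on every node, with node 0 showing the largest increase.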
	


> > 
> > -Tony
> bob

-- 
Thanks

Jack Steiner (steiner@sgi.com)          651-683-5302
Principal Engineer                      SGI - Silicon Graphics, Inc.


