* Very large memory configurations: > 16 TB
@ 2011-01-06 17:09 Jack Steiner
2011-01-07 12:16 ` Michel Lespinasse
2011-01-07 12:51 ` Ingo Molnar
0 siblings, 2 replies; 6+ messages in thread
From: Jack Steiner @ 2011-01-06 17:09 UTC
To: linux-mm, mingo; +Cc: linux-ia64
SGI is currently developing an x86_64 system with more than 16TB of memory per
SSI. As far as I can tell, this should be supported. The relevant definitions
such as MAX_PHYSMEM_BITS appear ok.
One area of concern is page counts. Exceeding 16TB will also exceed INT_MAX
page frames. The kernel (at least in all places I've found) keeps page counts
in longs.
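To make the page-count concern concrete, here is a minimal Python sketch of the arithmetic. It assumes 4 KiB base pages and a 32-bit C int, neither of which is stated above; the helper name `page_frames` is hypothetical.

```python
# Arithmetic sketch: how many page frames does a given amount of RAM need?
# ASSUMPTIONS (not from the thread): 4 KiB base pages, 32-bit C int.

PAGE_SHIFT = 12          # 4 KiB pages (assumed)
INT_MAX = 2**31 - 1      # largest 32-bit signed int
UINT_MAX = 2**32 - 1     # largest 32-bit unsigned int
TIB = 2**40


def page_frames(mem_bytes: int) -> int:
    """Number of base page frames needed to cover mem_bytes."""
    return mem_bytes >> PAGE_SHIFT


for tib in (4, 8, 16, 32, 64):
    frames = page_frames(tib * TIB)
    print(f"{tib:3d} TiB -> {frames:>14,d} frames | "
          f"fits in int: {frames <= INT_MAX} | "
          f"fits in unsigned int: {frames <= UINT_MAX}")
```

Under these assumptions a signed 32-bit counter already runs out at 8 TiB, and an unsigned one at 16 TiB, so any frame count kept in an int rather than a long would be a candidate for review.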
Have I missed anything? Should > 16TB work? Are there any kernel problems or
problems with user tools that anyone knows of?
Any help or pointers to potential problem areas would be appreciated...
---
Jack Steiner (steiner@sgi.com)
SGI - Silicon Graphics, Inc.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: Very large memory configurations: > 16 TB
2011-01-06 17:09 Very large memory configurations: > 16 TB Jack Steiner
@ 2011-01-07 12:16 ` Michel Lespinasse
2011-01-07 12:51 ` Ingo Molnar
1 sibling, 0 replies; 6+ messages in thread
From: Michel Lespinasse @ 2011-01-07 12:16 UTC
To: Jack Steiner; +Cc: linux-mm, mingo, linux-ia64
On Thu, Jan 6, 2011 at 9:09 AM, Jack Steiner <steiner@sgi.com> wrote:
> SGI is currently developing an x86_64 system with more than 16TB of memory per
> SSI. As far as I can tell, this should be supported. The relevant definitions
> such as MAX_PHYSMEM_BITS appear ok.
>
> One area of concern is page counts. Exceeding 16TB will also exceed INT_MAX
> page frames. The kernel (at least in all places I've found) keeps page counts
> in longs.
>
> Have I missed anything? Should > 16TB work? Are there any kernel problems or
> problems with user tools that anyone knows of?
>
> Any help or pointers to potential problem areas would be appreciated...
I don't know of any place that uses ints to count physical pages.
However, the page_referenced functions in mm/rmap.c return reference
counts as an integer. I believe a wraparound would only mislead the
LRU algorithms, but I haven't thought about it much. (Not sure why we
return a count anyway, since I believe callers only want to compare it
against zero ???)
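The wraparound concern can be illustrated with a small Python sketch. It does not touch any kernel code: `wrap_int32` is a hypothetical helper that emulates storing a value in a 32-bit signed int, and the counts fed to it are made-up inputs.

```python
# Emulate what happens when a large reference count is returned as a
# 32-bit signed int. Purely illustrative; no kernel API is modelled here.

def wrap_int32(n: int) -> int:
    """Value of n after truncation to a 32-bit two's-complement int."""
    return (n + 2**31) % 2**32 - 2**31


for true_count in (1, 2**31 - 1, 2**31, 2**32, 2**32 + 5):
    stored = wrap_int32(true_count)
    print(f"true={true_count:>11d}  stored={stored:>12d}  "
          f"nonzero test: {stored != 0}  positive test: {stored > 0}")
```

A caller that only tests the result against zero is misled when the true count is an exact multiple of 2^32, and a caller that tests for a positive value is misled whenever the stored value wraps negative. Returning a boolean would sidestep both cases, assuming callers really only need a zero/non-zero answer.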
--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
* Re: Very large memory configurations: > 16 TB
2011-01-06 17:09 Very large memory configurations: > 16 TB Jack Steiner
2011-01-07 12:16 ` Michel Lespinasse
@ 2011-01-07 12:51 ` Ingo Molnar
2011-01-07 16:31 ` Christoph Lameter
1 sibling, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2011-01-07 12:51 UTC
To: Jack Steiner; +Cc: linux-mm, linux-ia64
* Jack Steiner <steiner@sgi.com> wrote:
> SGI is currently developing an x86_64 system with more than 16TB of memory per
> SSI. As far as I can tell, this should be supported. The relevant definitions such
> as MAX_PHYSMEM_BITS appear ok.
>
>
> One area of concern is page counts. Exceeding 16TB will also exceed INT_MAX page
> frames. The kernel (at least in all places I've found) keeps page counts in longs.
>
> Have I missed anything? Should > 16TB work? Are there any kernel problems or
> problems with user tools that anyone knows of?
>
> Any help or pointers to potential problem areas would be appreciated...
See this older 2008 mail i wrote about our current x86 64-bit limits:
http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/00292.html
In that mail i outlined the various limits and the methods that it would take to
increase those limits, in order of difficulty. It appears we can probably go up to
32 TB relatively easily and up to 64 TB realistically - 128 TB theoretically.
Note that a number of unknown problems can obviously come up, so you should
try to simulate a ton of RAM ASAP, before building the hardware ;-) (We could even
try to add a "memory size debug" feature to the kernel which would inject huge
'fake' blocks of RAM that the kernel will pretend to have but will skip in the buddy
allocator or so.)
Beyond 64 TB it probably gets painful, very painful - a hardware extension to the
pagetable and canonical virtual memory space is the pragmatic solution there.
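For a rough sense of scale behind these round numbers, the sketch below works out the sizes implied by 48-bit canonical virtual addresses. This is background arithmetic about the x86-64 address space, not a restatement of the 2008 mail, and the variable names are illustrative.

```python
# Background arithmetic: 48-bit canonical virtual addresses split the
# address space into a lower (user) half and an upper (kernel) half.
# ASSUMPTION: 4-level paging with 48 virtual address bits.

TIB = 2**40
VIRT_BITS = 48
kernel_half = 2**(VIRT_BITS - 1)

print(f"total virtual space: {2**VIRT_BITS // TIB} TiB")
print(f"kernel half        : {kernel_half // TIB} TiB")

for phys_bits in (44, 45, 46, 47):
    ram = 2**phys_bits
    share = 100 * ram // kernel_half
    print(f"{phys_bits} physical bits -> {ram // TIB:3d} TiB of RAM, "
          f"{share:3d}% of the kernel half if mapped 1:1")
```

Mapping 128 TiB one-to-one would use up the entire kernel half, leaving nothing for vmalloc, the virtual memory map, or kernel text, which is one way to read "theoretically" here.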
Thanks,
Ingo
* Re: Very large memory configurations: > 16 TB
2011-01-07 12:51 ` Ingo Molnar
@ 2011-01-07 16:31 ` Christoph Lameter
2011-01-07 16:56 ` Ingo Molnar
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Lameter @ 2011-01-07 16:31 UTC
To: Jack Steiner; +Cc: Ingo Molnar, linux-mm, linux-ia64
Andi put a description of the memory layout in
Documentation/x86/x86_64/mm.txt. Seems to indicate that 64 TB was
considered as a maximum when the memory layout for x86_64 was set up:
------
Virtual memory map with 4 level page tables:
0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [48:63] sign extension
ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffffff80000000 - ffffffffa0000000 (=512 MB) kernel text mapping, from phys 0
ffffffffa0000000 - fffffffffff00000 (=1536 MB) module mapping space
The direct mapping covers all memory in the system up to the highest
memory address (this means in some cases it can also include PCI memory
holes).
vmalloc space is lazily synchronized into the different PML4 pages of
the processes using the page fault handler, with init_level4_pgt as
reference.
Current X86-64 implementations only support 40 bits of address space,
but we support up to 46 bits. This expands into MBZ space in the page
tables.
-Andi Kleen, Jul 2004
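The region sizes in the table can be checked mechanically. The sketch below recomputes them from the inclusive start and end addresses quoted above. The 64-byte `struct page` size and the 4 KiB page size are assumptions (typical values, but configuration-dependent), and all names are illustrative.

```python
# Recompute region sizes from the inclusive address ranges in the table.
TIB = 2**40

REGIONS = [
    ("user space",         0x0000000000000000, 0x00007fffffffffff),
    ("guard hole",         0xffff800000000000, 0xffff80ffffffffff),
    ("direct mapping",     0xffff880000000000, 0xffffc7ffffffffff),
    ("vmalloc/ioremap",    0xffffc90000000000, 0xffffe8ffffffffff),
    ("virtual memory map", 0xffffea0000000000, 0xffffeaffffffffff),
]


def region_size(start: int, end: int) -> int:
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1


sizes = {name: region_size(start, end) for name, start, end in REGIONS}
for name, size in sizes.items():
    # Every region here is a power of two, so bit_length() - 1 is its exponent.
    print(f"{name:20s} {size / TIB:6.1f} TiB (2^{size.bit_length() - 1})")

STRUCT_PAGE_BYTES = 64   # ASSUMPTION: typical sizeof(struct page)
PAGE_BYTES = 4096        # ASSUMPTION: 4 KiB base pages
describable = sizes["virtual memory map"] // STRUCT_PAGE_BYTES * PAGE_BYTES
print(f"virtual memory map can describe {describable // TIB} TiB of RAM")
```

Under those assumptions the 1 TB virtual memory map holds page descriptors for exactly as much RAM as the direct mapping covers, so raising the limit would mean growing both regions together.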
* Re: Very large memory configurations: > 16 TB
2011-01-07 16:31 ` Christoph Lameter
@ 2011-01-07 16:56 ` Ingo Molnar
2011-01-07 17:14 ` Christoph Lameter
0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2011-01-07 16:56 UTC
To: Christoph Lameter; +Cc: Jack Steiner, linux-mm, linux-ia64
* Christoph Lameter <cl@linux.com> wrote:
> Andi put a description of the memory layout in
> Documentation/x86/x86_64/mm.txt. Seems to indicate that 64 TB was
> considered as a maximum when the memory layout for x86_64 was set up:
Yes, that document was rather incomplete and does not really answer Jack's
questions, that's why i sent this more complete description originally:
http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/00292.html
a few years ago, answering a similar question.
Thanks,
Ingo
* Re: Very large memory configurations: > 16 TB
2011-01-07 16:56 ` Ingo Molnar
@ 2011-01-07 17:14 ` Christoph Lameter
0 siblings, 0 replies; 6+ messages in thread
From: Christoph Lameter @ 2011-01-07 17:14 UTC
To: Ingo Molnar; +Cc: Jack Steiner, linux-mm, linux-ia64
On Fri, 7 Jan 2011, Ingo Molnar wrote:
> Yes, that document was rather incomplete and does not really answer Jack's
> questions, that's why i sent this more complete description originally:
>
> http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/00292.html
>
> a few years ago, answering a similar question.
Please update the information in that file then.