linux-mm.kvack.org archive mirror
* [PATCH] x86: 46 bit PAE support
@ 2009-05-05 21:28 Rik van Riel
  2009-05-06  1:53 ` H. Peter Anvin
  0 siblings, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2009-05-05 21:28 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org; +Cc: linux-mm, mingo, akpm

Extend the maximum addressable memory on x86-64 from 2^44 to
2^46 bytes. This requires some shuffling around of the vmalloc
and virtual memmap memory areas, to keep them away from the
direct mapping of up to 64TB of physical memory.

This patch also introduces a guard hole between the vmalloc
area and the virtual memory map space.  There's really no
good reason why we wouldn't have a guard hole there.

Signed-off-by: Rik van Riel <riel@redhat.com>

---
Testing: booted it on an x86-64 system with 6GB RAM.  Did you really think
I had access to a system with 64TB of RAM? :)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 29b52b1..5394132 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -6,10 +6,11 @@ Virtual memory map with 4 level page tables:
 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
 hole caused by [48:63] sign extension
 ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
-ffff880000000000 - ffffc0ffffffffff (=57 TB) direct mapping of all phys. memory
-ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole
-ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space
-ffffe20000000000 - ffffe2ffffffffff (=40 bits) virtual memory map (1TB)
+ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
+ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
+ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
+ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
+ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
 ffffffff80000000 - ffffffffa0000000 (=512 MB)  kernel text mapping, from phys 0
 ffffffffa0000000 - fffffffffff00000 (=1536 MB) module mapping space
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index d38c91b..786306c 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -47,7 +47,7 @@
 #define __START_KERNEL		(__START_KERNEL_map + __PHYSICAL_START)
 #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
 
-/* See Documentation/x86_64/mm.txt for a description of the memory map. */
+/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
 #define __PHYSICAL_MASK_SHIFT	46
 #define __VIRTUAL_MASK_SHIFT	48
 
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index fbf42b8..766ea16 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -51,11 +51,11 @@ typedef struct { pteval_t pte; } pte_t;
 #define PGDIR_SIZE	(_AC(1, UL) << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
 
-
+/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
 #define MAXMEM		 _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
-#define VMALLOC_START    _AC(0xffffc20000000000, UL)
-#define VMALLOC_END      _AC(0xffffe1ffffffffff, UL)
-#define VMEMMAP_START	 _AC(0xffffe20000000000, UL)
+#define VMALLOC_START    _AC(0xffffc90000000000, UL)
+#define VMALLOC_END      _AC(0xffffe8ffffffffff, UL)
+#define VMEMMAP_START	 _AC(0xffffea0000000000, UL)
 #define MODULES_VADDR    _AC(0xffffffffa0000000, UL)
 #define MODULES_END      _AC(0xffffffffff000000, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index e3cc3c0..4517d6b 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -27,7 +27,7 @@
 #else /* CONFIG_X86_32 */
 # define SECTION_SIZE_BITS	27 /* matt - 128 is convenient right now */
 # define MAX_PHYSADDR_BITS	44
-# define MAX_PHYSMEM_BITS	44 /* Can be max 45 bits */
+# define MAX_PHYSMEM_BITS	46
 #endif
 
 #endif /* CONFIG_SPARSEMEM */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [PATCH] x86: 46 bit PAE support
  2009-05-05 21:28 [PATCH] x86: 46 bit PAE support Rik van Riel
@ 2009-05-06  1:53 ` H. Peter Anvin
  2009-05-06 12:20   ` Rik van Riel
  0 siblings, 1 reply; 8+ messages in thread
From: H. Peter Anvin @ 2009-05-06  1:53 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel@vger.kernel.org, linux-mm, mingo, akpm

Rik van Riel wrote:
> Testing: booted it on an x86-64 system with 6GB RAM.  Did you really think
> I had access to a system with 64TB of RAM? :)

No, but it would be good if we could test it under Qemu or KVM with an
appropriately set up sparse memory map.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

* Re: [PATCH] x86: 46 bit PAE support
  2009-05-06  1:53 ` H. Peter Anvin
@ 2009-05-06 12:20   ` Rik van Riel
  2009-05-06 12:30     ` Ingo Molnar
  2009-05-07 12:01     ` Pavel Machek
  0 siblings, 2 replies; 8+ messages in thread
From: Rik van Riel @ 2009-05-06 12:20 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel@vger.kernel.org, linux-mm, mingo, akpm

H. Peter Anvin wrote:
> Rik van Riel wrote:
>> Testing: booted it on an x86-64 system with 6GB RAM.  Did you really think
>> I had access to a system with 64TB of RAM? :)
> 
> No, but it would be good if we could test it under Qemu or KVM with an
> appropriately set up sparse memory map.

I don't have a system with 1TB either, which is how much space
the memmap[] would take...

-- 
All rights reversed.

* Re: [PATCH] x86: 46 bit PAE support
  2009-05-06 12:20   ` Rik van Riel
@ 2009-05-06 12:30     ` Ingo Molnar
  2009-05-07 12:01     ` Pavel Machek
  1 sibling, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2009-05-06 12:30 UTC (permalink / raw)
  To: Rik van Riel
  Cc: H. Peter Anvin, linux-kernel@vger.kernel.org, linux-mm, mingo,
	akpm


* Rik van Riel <riel@redhat.com> wrote:

> H. Peter Anvin wrote:
>> Rik van Riel wrote:
>>> Testing: booted it on an x86-64 system with 6GB RAM.  Did you really think
>>> I had access to a system with 64TB of RAM? :)
>>
>> No, but it would be good if we could test it under Qemu or KVM with an
>> appropriately set up sparse memory map.
>
> I don't have a system with 1TB either, which is how much space
> the memmap[] would take...

Not if the physical layout is sparse. I.e. something silly like:

  BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
  BIOS-e820: 0000200000000000 - 0000200040000000 (usable)

Which is 1GB of RAM at 4GB physical offset, and another 1GB of RAM 
at 32 TB physical offset. Takes two gigs of real RAM and a kernel 
modified with your patch, to not get confused by this :-)

	Ingo

* Re: [PATCH] x86: 46 bit PAE support
  2009-05-06 12:20   ` Rik van Riel
  2009-05-06 12:30     ` Ingo Molnar
@ 2009-05-07 12:01     ` Pavel Machek
  2009-05-07 14:16       ` Ingo Molnar
  1 sibling, 1 reply; 8+ messages in thread
From: Pavel Machek @ 2009-05-07 12:01 UTC (permalink / raw)
  To: Rik van Riel
  Cc: H. Peter Anvin, linux-kernel@vger.kernel.org, linux-mm, mingo,
	akpm

On Wed 2009-05-06 08:20:59, Rik van Riel wrote:
> H. Peter Anvin wrote:
>> Rik van Riel wrote:
>>> Testing: booted it on an x86-64 system with 6GB RAM.  Did you really think
>>> I had access to a system with 64TB of RAM? :)
>>
>> No, but it would be good if we could test it under Qemu or KVM with an
>> appropriately set up sparse memory map.
>
> I don't have a system with 1TB either, which is how much space
> the memmap[] would take...

Do we really have 1 byte overhead per 64 bytes of RAM?
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [PATCH] x86: 46 bit PAE support
  2009-05-07 12:01     ` Pavel Machek
@ 2009-05-07 14:16       ` Ingo Molnar
  2009-05-07 14:27         ` H. Peter Anvin
  0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2009-05-07 14:16 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Rik van Riel, H. Peter Anvin, linux-kernel@vger.kernel.org,
	linux-mm, mingo, akpm


* Pavel Machek <pavel@ucw.cz> wrote:

> On Wed 2009-05-06 08:20:59, Rik van Riel wrote:
> > H. Peter Anvin wrote:
> >> Rik van Riel wrote:
> >>> Testing: booted it on an x86-64 system with 6GB RAM.  Did you really think
> >>> I had access to a system with 64TB of RAM? :)
> >>
> >> No, but it would be good if we could test it under Qemu or KVM with an
> >> appropriately set up sparse memory map.
> >
> > I don't have a system with 1TB either, which is how much space
> > the memmap[] would take...
> 
> Do we really have 1 byte overhead per 64 bytes of RAM?
> 								Pavel

Yes, struct page is ~64 bytes, and 64*64 == 4096.

Alas, it's not a problem: my suggestion wasn't to simulate 64 TB of 
RAM. My suggestion was to create a sparse physical memory map (in a 
virtual machine) that spreads ~1GB of RAM all around the 64 TB 
physical address space. That will test whether the kernel is able to 
map and work with such physical addresses. (which will cover most of 
the issues)

A good look at /debug/x86/dump_pagetables with such a system booted 
up would be nice as well - to make sure every virtual memory range 
is in its proper area, and that there's enough free space around 
them.

	Ingo

* Re: [PATCH] x86: 46 bit PAE support
  2009-05-07 14:16       ` Ingo Molnar
@ 2009-05-07 14:27         ` H. Peter Anvin
  2009-05-07 14:49           ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: H. Peter Anvin @ 2009-05-07 14:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Pavel Machek, Rik van Riel, linux-kernel@vger.kernel.org,
	linux-mm, mingo, akpm

Ingo Molnar wrote:
> 
> Yes, struct page is ~64 bytes, and 64*64 == 4096.
> 
> Alas, it's not a problem: my suggestion wasn't to simulate 64 TB of 
> RAM. My suggestion was to create a sparse physical memory map (in a 
> virtual machine) that spreads ~1GB of RAM all around the 64 TB 
> physical address space. That will test whether the kernel is able to 
> map and work with such physical addresses. (which will cover most of 
> the issues)
> 
> A good look at /debug/x86/dump_pagetables with such a system booted 
> up would be nice as well - to make sure every virtual memory range 
> is in its proper area, and that there's enough free space around 
> them.
> 

We're working on simulating this at Intel.  We should hopefully be able
to test this next week.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

* Re: [PATCH] x86: 46 bit PAE support
  2009-05-07 14:27         ` H. Peter Anvin
@ 2009-05-07 14:49           ` Ingo Molnar
  0 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2009-05-07 14:49 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Pavel Machek, Rik van Riel, linux-kernel@vger.kernel.org,
	linux-mm, mingo, akpm


* H. Peter Anvin <hpa@zytor.com> wrote:

> Ingo Molnar wrote:
> > 
> > Yes, struct page is ~64 bytes, and 64*64 == 4096.
> > 
> > Alas, it's not a problem: my suggestion wasn't to simulate 64 TB of 
> > RAM. My suggestion was to create a sparse physical memory map (in a 
> > virtual machine) that spreads ~1GB of RAM all around the 64 TB 
> > physical address space. That will test whether the kernel is able to 
> > map and work with such physical addresses. (which will cover most of 
> > the issues)
> > 
> > A good look at /debug/x86/dump_pagetables with such a system booted 
> > up would be nice as well - to make sure every virtual memory range 
> > is in its proper area, and that there's enough free space around 
> > them.
> > 
> 
> We're working on simulating this at Intel.  We should hopefully be 
> able to test this next week.

Wow, very nice!

It would be nice to do it on a KVM basis and submit the 
weird-memory-layout submission to the KVM tree. It would be helpful 
with the reproduction of weird, memory layout dependent bugs too for 
example. Plus we could create a test facility that randomizes the 
physical memory layout (with a given fragmentation level).

	Ingo

end of thread, other threads:[~2009-05-07 14:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-05 21:28 [PATCH] x86: 46 bit PAE support Rik van Riel
2009-05-06  1:53 ` H. Peter Anvin
2009-05-06 12:20   ` Rik van Riel
2009-05-06 12:30     ` Ingo Molnar
2009-05-07 12:01     ` Pavel Machek
2009-05-07 14:16       ` Ingo Molnar
2009-05-07 14:27         ` H. Peter Anvin
2009-05-07 14:49           ` Ingo Molnar
