* [PATCH] x86: 46 bit PAE support
From: Rik van Riel @ 2009-05-05 21:28 UTC
To: linux-kernel@vger.kernel.org; +Cc: linux-mm, mingo, akpm
Extend the maximum addressable memory on x86-64 from 2^44 to
2^46 bytes. This requires some shuffling around of the vmalloc
and virtual memmap memory areas, to keep them away from the
direct mapping of up to 64TB of physical memory.
This patch also introduces a guard hole between the vmalloc
area and the virtual memory map space. There's really no
good reason why we wouldn't have a guard hole there.
Signed-off-by: Rik van Riel <riel@redhat.com>
---
Testing: booted it on an x86-64 system with 6GB RAM. Did you really think
I had access to a system with 64TB of RAM? :)
diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 29b52b1..5394132 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -6,10 +6,11 @@ Virtual memory map with 4 level page tables:
0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [48:63] sign extension
ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
-ffff880000000000 - ffffc0ffffffffff (=57 TB) direct mapping of all phys. memory
-ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole
-ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space
-ffffe20000000000 - ffffe2ffffffffff (=40 bits) virtual memory map (1TB)
+ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
+ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
+ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
+ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
+ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffffff80000000 - ffffffffa0000000 (=512 MB) kernel text mapping, from phys 0
ffffffffa0000000 - fffffffffff00000 (=1536 MB) module mapping space
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index d38c91b..786306c 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -47,7 +47,7 @@
#define __START_KERNEL (__START_KERNEL_map + __PHYSICAL_START)
#define __START_KERNEL_map _AC(0xffffffff80000000, UL)
-/* See Documentation/x86_64/mm.txt for a description of the memory map. */
+/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
#define __PHYSICAL_MASK_SHIFT 46
#define __VIRTUAL_MASK_SHIFT 48
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index fbf42b8..766ea16 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -51,11 +51,11 @@ typedef struct { pteval_t pte; } pte_t;
#define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE - 1))
-
+/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
#define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
-#define VMALLOC_START _AC(0xffffc20000000000, UL)
-#define VMALLOC_END _AC(0xffffe1ffffffffff, UL)
-#define VMEMMAP_START _AC(0xffffe20000000000, UL)
+#define VMALLOC_START _AC(0xffffc90000000000, UL)
+#define VMALLOC_END _AC(0xffffe8ffffffffff, UL)
+#define VMEMMAP_START _AC(0xffffea0000000000, UL)
#define MODULES_VADDR _AC(0xffffffffa0000000, UL)
#define MODULES_END _AC(0xffffffffff000000, UL)
#define MODULES_LEN (MODULES_END - MODULES_VADDR)
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index e3cc3c0..4517d6b 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -27,7 +27,7 @@
#else /* CONFIG_X86_32 */
# define SECTION_SIZE_BITS 27 /* matt - 128 is convenient right now */
# define MAX_PHYSADDR_BITS 44
-# define MAX_PHYSMEM_BITS 44 /* Can be max 45 bits */
+# define MAX_PHYSMEM_BITS 46
#endif
#endif /* CONFIG_SPARSEMEM */
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH] x86: 46 bit PAE support
From: H. Peter Anvin @ 2009-05-06 1:53 UTC
To: Rik van Riel; +Cc: linux-kernel@vger.kernel.org, linux-mm, mingo, akpm
Rik van Riel wrote:
> Testing: booted it on an x86-64 system with 6GB RAM. Did you really think
> I had access to a system with 64TB of RAM? :)
No, but it would be good if we could test it under Qemu or KVM with an
appropriately set up sparse memory map.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH] x86: 46 bit PAE support
From: Rik van Riel @ 2009-05-06 12:20 UTC
To: H. Peter Anvin; +Cc: linux-kernel@vger.kernel.org, linux-mm, mingo, akpm
H. Peter Anvin wrote:
> Rik van Riel wrote:
>> Testing: booted it on an x86-64 system with 6GB RAM. Did you really think
>> I had access to a system with 64TB of RAM? :)
>
> No, but it would be good if we could test it under Qemu or KVM with an
> appropriately set up sparse memory map.
I don't have a system with 1TB either, which is how much space
the memmap[] would take...
--
All rights reversed.
* Re: [PATCH] x86: 46 bit PAE support
From: Ingo Molnar @ 2009-05-06 12:30 UTC
To: Rik van Riel
Cc: H. Peter Anvin, linux-kernel@vger.kernel.org, linux-mm, mingo,
akpm
* Rik van Riel <riel@redhat.com> wrote:
> H. Peter Anvin wrote:
>> Rik van Riel wrote:
>>> Testing: booted it on an x86-64 system with 6GB RAM. Did you really think
>>> I had access to a system with 64TB of RAM? :)
>>
>> No, but it would be good if we could test it under Qemu or KVM with an
>> appropriately set up sparse memory map.
>
> I don't have a system with 1TB either, which is how much space
> the memmap[] would take...
Not if the physical layout is sparse. I.e. something silly like:
BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
BIOS-e820: 0000200000000000 - 0000200040000000 (usable)
Which is 1GB of RAM at 4GB physical offset, and another 1GB of RAM
at 32 TB physical offset. Takes two gigs of real RAM and a kernel
modified with your patch, to not get confused by this :-)
Ingo
* Re: [PATCH] x86: 46 bit PAE support
From: Pavel Machek @ 2009-05-07 12:01 UTC
To: Rik van Riel
Cc: H. Peter Anvin, linux-kernel@vger.kernel.org, linux-mm, mingo,
akpm
On Wed 2009-05-06 08:20:59, Rik van Riel wrote:
> H. Peter Anvin wrote:
>> Rik van Riel wrote:
>>> Testing: booted it on an x86-64 system with 6GB RAM. Did you really think
>>> I had access to a system with 64TB of RAM? :)
>>
>> No, but it would be good if we could test it under Qemu or KVM with an
>> appropriately set up sparse memory map.
>
> I don't have a system with 1TB either, which is how much space
> the memmap[] would take...
Do we really have 1 byte overhead per 64 bytes of RAM?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [PATCH] x86: 46 bit PAE support
From: Ingo Molnar @ 2009-05-07 14:16 UTC
To: Pavel Machek
Cc: Rik van Riel, H. Peter Anvin, linux-kernel@vger.kernel.org,
linux-mm, mingo, akpm
* Pavel Machek <pavel@ucw.cz> wrote:
> On Wed 2009-05-06 08:20:59, Rik van Riel wrote:
> > H. Peter Anvin wrote:
> >> Rik van Riel wrote:
> >>> Testing: booted it on an x86-64 system with 6GB RAM. Did you really think
> >>> I had access to a system with 64TB of RAM? :)
> >>
> >> No, but it would be good if we could test it under Qemu or KVM with an
> >> appropriately set up sparse memory map.
> >
> > I don't have a system with 1TB either, which is how much space
> > the memmap[] would take...
>
> Do we really have 1 byte overhead per 64 bytes of RAM?
> Pavel
Yes, struct page is ~64 bytes, and 64*64 == 4096.
Alas, it's not a problem: my suggestion wasn't to simulate 64 TB of
RAM. My suggestion was to create a sparse physical memory map (in a
virtual machine) that spreads ~1GB of RAM all around the 64 TB
physical address space. That will test whether the kernel is able to
map and work with such physical addresses. (which will cover most of
the issues)
A good look at /debug/x86/dump_pagetables with such a system booted
up would be nice as well - to make sure every virtual memory range
is in its proper area, and that there's enough free space around
them.
Ingo
* Re: [PATCH] x86: 46 bit PAE support
From: H. Peter Anvin @ 2009-05-07 14:27 UTC
To: Ingo Molnar
Cc: Pavel Machek, Rik van Riel, linux-kernel@vger.kernel.org,
linux-mm, mingo, akpm
Ingo Molnar wrote:
>
> Yes, struct page is ~64 bytes, and 64*64 == 4096.
>
> Alas, it's not a problem: my suggestion wasn't to simulate 64 TB of
> RAM. My suggestion was to create a sparse physical memory map (in a
> virtual machine) that spreads ~1GB of RAM all around the 64 TB
> physical address space. That will test whether the kernel is able to
> map and work with such physical addresses. (which will cover most of
> the issues)
>
> A good look at /debug/x86/dump_pagetables with such a system booted
> up would be nice as well - to make sure every virtual memory range
> is in its proper area, and that there's enough free space around
> them.
>
We're working on simulating this at Intel. We should hopefully be able
to test this next week.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH] x86: 46 bit PAE support
From: Ingo Molnar @ 2009-05-07 14:49 UTC
To: H. Peter Anvin
Cc: Pavel Machek, Rik van Riel, linux-kernel@vger.kernel.org,
linux-mm, mingo, akpm
* H. Peter Anvin <hpa@zytor.com> wrote:
> Ingo Molnar wrote:
> >
> > Yes, struct page is ~64 bytes, and 64*64 == 4096.
> >
> > Alas, it's not a problem: my suggestion wasn't to simulate 64 TB of
> > RAM. My suggestion was to create a sparse physical memory map (in a
> > virtual machine) that spreads ~1GB of RAM all around the 64 TB
> > physical address space. That will test whether the kernel is able to
> > map and work with such physical addresses. (which will cover most of
> > the issues)
> >
> > A good look at /debug/x86/dump_pagetables with such a system booted
> > up would be nice as well - to make sure every virtual memory range
> > is in its proper area, and that there's enough free space around
> > them.
> >
>
> We're working on simulating this at Intel. We should hopefully be
> able to test this next week.
Wow, very nice!
It would be nice to do it on a KVM basis and submit the
weird-memory-layout submission to the KVM tree. It would be helpful
with the reproduction of weird, memory layout dependent bugs too for
example. Plus we could create a test facility that randomizes the
physical memory layout (with a given fragmentation level).
Ingo