public inbox for linux-kernel@vger.kernel.org
* x86 question: Can a process have > 3GB memory?
@ 2002-05-07 23:03 Clifford White
  2002-05-07 23:08 ` Robert Love
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Clifford White @ 2002-05-07 23:03 UTC (permalink / raw)
  To: linux-kernel


We are working with a database that requires a large amount of memory
allocated by a single process.
This is on an Intel 32-bit platform.
We'd like to go > 3GB of memory per process.
Is this possible on a 32-bit machine? I have been reading the various
'highmem' discussions, but that's kernel page tables...
Or is this a glibc issue, and not proper for a kernel-list question?
Any pointers would be appreciated. The Intel ESMA (Extended Server Memory
Arch) page states that it's possible, but.....how?

cliffw
NUMA-Q
Technical Guy
1-503-578-4306




* Re: x86 question: Can a process have > 3GB memory?
  2002-05-07 23:03 x86 question: Can a process have > 3GB memory? Clifford White
@ 2002-05-07 23:08 ` Robert Love
  2002-05-08  5:33   ` Martin J. Bligh
  2002-05-08  8:29   ` Andrea Arcangeli
  2002-05-07 23:33 ` Alan Cox
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Robert Love @ 2002-05-07 23:08 UTC (permalink / raw)
  To: Clifford White; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 953 bytes --]

On Tue, 2002-05-07 at 16:03, Clifford White wrote:

> We are working with a database that requires a large amount of memory
> allocated by a single process.
> This is on an Intel 32-bit platform.
> We'd like to go > 3GB of memory per process.
> Is this possible on a 32-bit machine? I have been reading the various
> 'highmem' discussions, but that's kernel page tables...
> Or is this a glibc issue, and not proper for a kernel-list question?
> Any pointers would be appreciated. The Intel ESMA (Extended Server Memory
> Arch) page states that it's possible, but.....how?

You can go to 3.5GB, anything more and stuff starts getting real tight
and not very nice.  You can only do 3.5/0.5 on non-PAE, though - PAE
requires segments to be aligned on 1GB-boundaries.

The attached patch (for which credit goes elsewhere - Ingo or Randy, I
think?) implements the full range of 1 to 3.5GB user space partitioning,
selectable at compile-time.
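
For reference, the arithmetic behind the split options can be sketched in C.
The PAGE_OFFSET values and the TASK_UNMAPPED_BASE divisors (3 and 16) are
taken from the patch; treating TASK_SIZE as equal to PAGE_OFFSET is an
assumption based on stock 2.4 i386, and the function names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* User space spans [0, PAGE_OFFSET), so its size in MB is PAGE_OFFSET >> 20. */
unsigned user_mb(uint32_t page_offset)
{
    return page_offset >> 20;
}

/* The kernel gets whatever is left of the 4GB (2^32 byte) address space. */
unsigned kernel_mb(uint32_t page_offset)
{
    return (unsigned)(((UINT64_C(1) << 32) - page_offset) >> 20);
}

/*
 * Where the mmap search starts: TASK_SIZE / 3 normally, TASK_SIZE / 16
 * under CONFIG_05GB.  Assumption: TASK_SIZE == PAGE_OFFSET.
 */
uint32_t unmapped_base(uint32_t page_offset, unsigned divisor)
{
    assert(divisor != 0);
    return page_offset / divisor;
}
```

With the default 0xC0000000 this gives 3072MB of user space, 1024MB for the
kernel and an mmap base of 0x40000000; with CONFIG_05GB (0xE0000000) it gives
3584MB, 512MB and 0x0E000000.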

	Robert Love


[-- Attachment #2: 00_3.5G-address-space-4.patch --]
[-- Type: text/x-patch, Size: 9478 bytes --]

diff -urN 2.4.18pre7/Rules.make 3g/Rules.make
--- 2.4.18pre7/Rules.make	Thu Jan 24 02:05:25 2002
+++ 3g/Rules.make	Mon Jan 28 05:55:28 2002
@@ -214,12 +214,29 @@
 #
 # Added the SMP separator to stop module accidents between uniprocessor
 # and SMP Intel boxes - AC - from bits by Michael Chastain
+# Added separator for different PAGE_OFFSET memory models - Ingo.
 #
 
 ifdef CONFIG_SMP
 	genksyms_smp_prefix := -p smp_
 else
 	genksyms_smp_prefix := 
+endif
+
+ifdef CONFIG_2GB
+ifdef CONFIG_SMP
+	genksyms_smp_prefix := -p smp_2gig_
+else
+	genksyms_smp_prefix := -p 2gig_
+endif
+endif
+
+ifdef CONFIG_3GB
+ifdef CONFIG_SMP
+	genksyms_smp_prefix := -p smp_3gig_
+else
+	genksyms_smp_prefix := -p 3gig_
+endif
 endif
 
 $(MODINCL)/%.ver: %.c
diff -urN 2.4.18pre7/arch/i386/Makefile 3g/arch/i386/Makefile
--- 2.4.18pre7/arch/i386/Makefile	Tue May  1 19:35:18 2001
+++ 3g/arch/i386/Makefile	Mon Jan 28 05:55:28 2002
@@ -106,6 +106,9 @@
 
 MAKEBOOT = $(MAKE) -C arch/$(ARCH)/boot
 
+arch/i386/vmlinux.lds: arch/i386/vmlinux.lds.S FORCE
+	$(CPP) -C -P -I$(HPATH) -imacros $(HPATH)/asm-i386/page_offset.h -Ui386 arch/i386/vmlinux.lds.S >arch/i386/vmlinux.lds
+
 vmlinux: arch/i386/vmlinux.lds
 
 FORCE: ;
@@ -142,6 +145,7 @@
 	@$(MAKEBOOT) clean
 
 archmrproper:
+	rm -f arch/i386/vmlinux.lds
 
 archdep:
 	@$(MAKEBOOT) dep
diff -urN 2.4.18pre7/arch/i386/config.in 3g/arch/i386/config.in
--- 2.4.18pre7/arch/i386/config.in	Thu Jan 24 02:05:26 2002
+++ 3g/arch/i386/config.in	Mon Jan 28 05:55:30 2002
@@ -171,12 +171,23 @@
 	"off    CONFIG_NOHIGHMEM \
 	 4GB    CONFIG_HIGHMEM4G \
 	 64GB   CONFIG_HIGHMEM64G" off
-if [ "$CONFIG_HIGHMEM4G" = "y" ]; then
+if [ "$CONFIG_HIGHMEM4G" = "y" -o "$CONFIG_HIGHMEM64G" = "y" ]; then
    define_bool CONFIG_HIGHMEM y
+else
+   define_bool CONFIG_HIGHMEM n
 fi
 if [ "$CONFIG_HIGHMEM64G" = "y" ]; then
-   define_bool CONFIG_HIGHMEM y
    define_bool CONFIG_X86_PAE y
+   choice 'User address space size' \
+	"3GB		CONFIG_1GB \
+	 2GB		CONFIG_2GB \
+	 1GB		CONFIG_3GB" 3GB
+else
+   choice 'User address space size' \
+	"3GB		CONFIG_1GB \
+	 2GB		CONFIG_2GB \
+	 1GB		CONFIG_3GB \
+	 3.5GB		CONFIG_05GB" 3GB
 fi
 
 bool 'Math emulation' CONFIG_MATH_EMULATION
diff -urN 2.4.18pre7/arch/i386/vmlinux.lds 3g/arch/i386/vmlinux.lds
--- 2.4.18pre7/arch/i386/vmlinux.lds	Thu Jan 24 02:05:26 2002
+++ 3g/arch/i386/vmlinux.lds	Thu Jan  1 01:00:00 1970
@@ -1,82 +0,0 @@
-/* ld script to make i386 Linux kernel
- * Written by Martin Mares <mj@atrey.karlin.mff.cuni.cz>;
- */
-OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
-OUTPUT_ARCH(i386)
-ENTRY(_start)
-SECTIONS
-{
-  . = 0xC0000000 + 0x100000;
-  _text = .;			/* Text and read-only data */
-  .text : {
-	*(.text)
-	*(.fixup)
-	*(.gnu.warning)
-	} = 0x9090
-
-  _etext = .;			/* End of text section */
-
-  .rodata : { *(.rodata) *(.rodata.*) }
-  .kstrtab : { *(.kstrtab) }
-
-  . = ALIGN(16);		/* Exception table */
-  __start___ex_table = .;
-  __ex_table : { *(__ex_table) }
-  __stop___ex_table = .;
-
-  __start___ksymtab = .;	/* Kernel symbol table */
-  __ksymtab : { *(__ksymtab) }
-  __stop___ksymtab = .;
-
-  .data : {			/* Data */
-	*(.data)
-	CONSTRUCTORS
-	}
-
-  _edata = .;			/* End of data section */
-
-  . = ALIGN(8192);		/* init_task */
-  .data.init_task : { *(.data.init_task) }
-
-  . = ALIGN(4096);		/* Init code and data */
-  __init_begin = .;
-  .text.init : { *(.text.init) }
-  .data.init : { *(.data.init) }
-  . = ALIGN(16);
-  __setup_start = .;
-  .setup.init : { *(.setup.init) }
-  __setup_end = .;
-  __initcall_start = .;
-  .initcall.init : { *(.initcall.init) }
-  __initcall_end = .;
-  . = ALIGN(4096);
-  __init_end = .;
-
-  . = ALIGN(4096);
-  .data.page_aligned : { *(.data.idt) }
-
-  . = ALIGN(32);
-  .data.cacheline_aligned : { *(.data.cacheline_aligned) }
-
-  __bss_start = .;		/* BSS */
-  .bss : {
-	*(.bss)
-	}
-  _end = . ;
-
-  /* Sections to be discarded */
-  /DISCARD/ : {
-	*(.text.exit)
-	*(.data.exit)
-	*(.exitcall.exit)
-	}
-
-  /* Stabs debugging sections.  */
-  .stab 0 : { *(.stab) }
-  .stabstr 0 : { *(.stabstr) }
-  .stab.excl 0 : { *(.stab.excl) }
-  .stab.exclstr 0 : { *(.stab.exclstr) }
-  .stab.index 0 : { *(.stab.index) }
-  .stab.indexstr 0 : { *(.stab.indexstr) }
-  .comment 0 : { *(.comment) }
-}
diff -urN 2.4.18pre7/arch/i386/vmlinux.lds.S 3g/arch/i386/vmlinux.lds.S
--- 2.4.18pre7/arch/i386/vmlinux.lds.S	Thu Jan  1 01:00:00 1970
+++ 3g/arch/i386/vmlinux.lds.S	Mon Jan 28 05:55:28 2002
@@ -0,0 +1,82 @@
+/* ld script to make i386 Linux kernel
+ * Written by Martin Mares <mj@atrey.karlin.mff.cuni.cz>;
+ */
+OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
+OUTPUT_ARCH(i386)
+ENTRY(_start)
+SECTIONS
+{
+  . = PAGE_OFFSET_RAW + 0x100000;
+  _text = .;			/* Text and read-only data */
+  .text : {
+	*(.text)
+	*(.fixup)
+	*(.gnu.warning)
+	} = 0x9090
+
+  _etext = .;			/* End of text section */
+
+  .rodata : { *(.rodata) *(.rodata.*) }
+  .kstrtab : { *(.kstrtab) }
+
+  . = ALIGN(16);		/* Exception table */
+  __start___ex_table = .;
+  __ex_table : { *(__ex_table) }
+  __stop___ex_table = .;
+
+  __start___ksymtab = .;	/* Kernel symbol table */
+  __ksymtab : { *(__ksymtab) }
+  __stop___ksymtab = .;
+
+  .data : {			/* Data */
+	*(.data)
+	CONSTRUCTORS
+	}
+
+  _edata = .;			/* End of data section */
+
+  . = ALIGN(8192);		/* init_task */
+  .data.init_task : { *(.data.init_task) }
+
+  . = ALIGN(4096);		/* Init code and data */
+  __init_begin = .;
+  .text.init : { *(.text.init) }
+  .data.init : { *(.data.init) }
+  . = ALIGN(16);
+  __setup_start = .;
+  .setup.init : { *(.setup.init) }
+  __setup_end = .;
+  __initcall_start = .;
+  .initcall.init : { *(.initcall.init) }
+  __initcall_end = .;
+  . = ALIGN(4096);
+  __init_end = .;
+
+  . = ALIGN(4096);
+  .data.page_aligned : { *(.data.idt) }
+
+  . = ALIGN(32);
+  .data.cacheline_aligned : { *(.data.cacheline_aligned) }
+
+  __bss_start = .;		/* BSS */
+  .bss : {
+	*(.bss)
+	}
+  _end = . ;
+
+  /* Sections to be discarded */
+  /DISCARD/ : {
+	*(.text.exit)
+	*(.data.exit)
+	*(.exitcall.exit)
+	}
+
+  /* Stabs debugging sections.  */
+  .stab 0 : { *(.stab) }
+  .stabstr 0 : { *(.stabstr) }
+  .stab.excl 0 : { *(.stab.excl) }
+  .stab.exclstr 0 : { *(.stab.exclstr) }
+  .stab.index 0 : { *(.stab.index) }
+  .stab.indexstr 0 : { *(.stab.indexstr) }
+  .comment 0 : { *(.comment) }
+}
diff -urN 2.4.18pre7/include/asm-i386/page.h 3g/include/asm-i386/page.h
--- 2.4.18pre7/include/asm-i386/page.h	Thu Jan 24 02:06:02 2002
+++ 3g/include/asm-i386/page.h	Mon Jan 28 05:55:28 2002
@@ -78,7 +78,9 @@
  * and CONFIG_HIGHMEM64G options in the kernel configuration.
  */
 
-#define __PAGE_OFFSET		(0xC0000000)
+#include <asm/page_offset.h>
+
+#define __PAGE_OFFSET		(PAGE_OFFSET_RAW)
 
 /*
  * This much address space is reserved for vmalloc() and iomap()
diff -urN 2.4.18pre7/include/asm-i386/page_offset.h 3g/include/asm-i386/page_offset.h
--- 2.4.18pre7/include/asm-i386/page_offset.h	Thu Jan  1 01:00:00 1970
+++ 3g/include/asm-i386/page_offset.h	Mon Jan 28 05:55:28 2002
@@ -0,0 +1,10 @@
+#include <linux/config.h>
+#ifdef CONFIG_05GB
+#define PAGE_OFFSET_RAW 0xE0000000
+#elif defined(CONFIG_1GB)
+#define PAGE_OFFSET_RAW 0xC0000000
+#elif defined(CONFIG_2GB)
+#define PAGE_OFFSET_RAW 0x80000000
+#elif defined(CONFIG_3GB)
+#define PAGE_OFFSET_RAW 0x40000000
+#endif
diff -urN 2.4.18pre7/include/asm-i386/processor.h 3g/include/asm-i386/processor.h
--- 2.4.18pre7/include/asm-i386/processor.h	Tue Jan 22 18:55:59 2002
+++ 3g/include/asm-i386/processor.h	Mon Jan 28 05:55:28 2002
@@ -270,7 +270,11 @@
 /* This decides where the kernel will search for a free chunk of vm
  * space during mmap's.
  */
+#ifndef CONFIG_05GB
 #define TASK_UNMAPPED_BASE	(TASK_SIZE / 3)
+#else
+#define TASK_UNMAPPED_BASE	(TASK_SIZE / 16)
+#endif
 
 /*
  * Size of io_bitmap in longwords: 32 is ports 0-0x3ff.
diff -urN 2.4.18pre7/mm/memory.c 3g/mm/memory.c
--- 2.4.18pre7/mm/memory.c	Tue Jan 22 18:56:30 2002
+++ 3g/mm/memory.c	Mon Jan 28 05:55:28 2002
@@ -106,8 +106,7 @@
 
 static inline void free_one_pgd(pgd_t * dir)
 {
-	int j;
-	pmd_t * pmd;
+	pmd_t * pmd, * md, * emd;
 
 	if (pgd_none(*dir))
 		return;
@@ -118,9 +117,23 @@
 	}
 	pmd = pmd_offset(dir, 0);
 	pgd_clear(dir);
-	for (j = 0; j < PTRS_PER_PMD ; j++) {
-		prefetchw(pmd+j+(PREFETCH_STRIDE/16));
-		free_one_pmd(pmd+j);
+
+	/*
+	 * Beware if changing the loop below.  It once used int j,
+	 *	for (j = 0; j < PTRS_PER_PMD; j++)
+	 *		free_one_pmd(pmd+j);
+	 * but some older i386 compilers (e.g. egcs-2.91.66, gcc-2.95.3)
+	 * terminated the loop with a _signed_ address comparison
+	 * using "jle", when configured for HIGHMEM64GB (X86_PAE).
+	 * If also configured for 3GB of kernel virtual address space,
+	 * if page at physical 0x3ffff000 virtual 0x7ffff000 is used as
+	 * a pmd, when that mm exits the loop goes on to free "entries"
+	 * found at 0x80000000 onwards.  The loop below compiles instead
+	 * to be terminated by unsigned address comparison using "jb".
+	 */
+	for (md = pmd, emd = pmd + PTRS_PER_PMD; md < emd; md++) {
+		prefetchw(md+(PREFETCH_STRIDE/16));
+		free_one_pmd(md);
 	}
 	pmd_free(pmd);
 }
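
The signed-comparison problem described in the mm/memory.c comment can be
modelled in user space. This is only a sketch: it assumes the compiler turned
the loop into "compare against the address of the last entry", it uses plain
32-bit integers in place of pointers, and count_iterations() and its cap
argument are hypothetical. The int32_t casts rely on the usual two's
complement wraparound.

```c
#include <assert.h>
#include <stdint.h>

#define PMD_ENTRY_SIZE 8u    /* PAE pmd entries are 64 bits wide */
#define PTRS_PER_PMD   512u  /* 4096-byte page / 8-byte entries */

/*
 * Walk one pmd page starting at 32-bit address `start` and count the
 * entries visited.  signed_compare selects a "jle"-style signed test,
 * otherwise an unsigned one.  `cap` stops a runaway loop.
 */
unsigned count_iterations(uint32_t start, int signed_compare, unsigned cap)
{
    uint32_t last = start + (PTRS_PER_PMD - 1) * PMD_ENTRY_SIZE;
    uint32_t addr = start;
    unsigned n = 0;
    int more;

    assert(cap > 0);
    do {
        n++;                        /* stands in for free_one_pmd() */
        addr += PMD_ENTRY_SIZE;
        if (signed_compare)
            more = (int32_t)addr <= (int32_t)last;  /* signed compare */
        else
            more = addr <= last;                    /* unsigned compare */
    } while (more && n < cap);
    return n;
}
```

For a pmd page at 0x7ffff000 the unsigned walk stops after 512 entries, while
the signed walk sees 0x80000000 as a negative number, keeps going, and only
stops at the cap. For a page lower down, such as 0x3ffff000, both variants
stop after 512 entries.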


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-07 23:03 x86 question: Can a process have > 3GB memory? Clifford White
  2002-05-07 23:08 ` Robert Love
@ 2002-05-07 23:33 ` Alan Cox
  2002-05-08 16:54   ` Bill Davidsen
  2002-05-08  0:16 ` Gerrit Huizenga
  2002-05-08  8:22 ` Luigi Genoni
  3 siblings, 1 reply; 20+ messages in thread
From: Alan Cox @ 2002-05-07 23:33 UTC (permalink / raw)
  To: Clifford White; +Cc: linux-kernel

> We'd like to go > 3GB of memory per process.
> Is this possible on a 32-bit machine? I have been reading the various

Yes and no...

> Any pointers would be appreciated. The Intel ESMA (Extended Server Memory
> Arch) page states that it's possible, but.....how?

Remember DOS and EMM memory expansion. Basically that is what you come
down to. Allocate multiple large shared memory segments, attach the one
you need each time and implement software segment swapping.
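
A minimal sketch of that kind of software segment swapping, using System V
shared memory. The bank count, the bank size and all the names here are
invented for illustration; a real database would use far larger segments and
more careful error handling.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define NBANKS    4
#define BANK_SIZE (1u << 20)  /* 1MB per bank keeps the demo small */

struct banks {
    int   id[NBANKS];  /* one shm segment per bank */
    char *win;         /* address of the currently attached bank */
    int   cur;         /* index of the attached bank, -1 if none */
};

/* Create all banks up front; nothing is attached yet. */
int banks_init(struct banks *b)
{
    int i;

    for (i = 0; i < NBANKS; i++) {
        b->id[i] = shmget(IPC_PRIVATE, BANK_SIZE, IPC_CREAT | 0600);
        if (b->id[i] < 0) {
            while (--i >= 0)
                shmctl(b->id[i], IPC_RMID, NULL);
            return -1;
        }
    }
    b->win = NULL;
    b->cur = -1;
    return 0;
}

/* Detach the current bank (if any) and attach bank n in its place. */
char *bank_select(struct banks *b, int n)
{
    void *p;

    assert(n >= 0 && n < NBANKS);
    if (b->cur == n)
        return b->win;
    if (b->win != NULL && shmdt(b->win) < 0)
        return NULL;
    b->win = NULL;
    b->cur = -1;
    p = shmat(b->id[n], NULL, 0);
    if (p == (void *)-1)
        return NULL;
    b->win = p;
    b->cur = n;
    return b->win;
}

/* Detach and remove every segment. */
void banks_destroy(struct banks *b)
{
    int i;

    if (b->win != NULL)
        shmdt(b->win);
    for (i = 0; i < NBANKS; i++)
        shmctl(b->id[i], IPC_RMID, NULL);
}
```

Data written to a bank survives while another bank is attached, because the
segment itself stays alive; only the mapping into the process goes away. Each
switch costs two system calls, which is the price of this scheme.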

That should have you diving for an AMD hammer or IA64 box as soon as they
come out 8)


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-07 23:03 x86 question: Can a process have > 3GB memory? Clifford White
  2002-05-07 23:08 ` Robert Love
  2002-05-07 23:33 ` Alan Cox
@ 2002-05-08  0:16 ` Gerrit Huizenga
  2002-05-08  0:56   ` Rik van Riel
  2002-05-08  8:22 ` Luigi Genoni
  3 siblings, 1 reply; 20+ messages in thread
From: Gerrit Huizenga @ 2002-05-08  0:16 UTC (permalink / raw)
  To: Clifford White; +Cc: linux-kernel, oliendm

Hey Cliff, we are planning to implement virtwin() if you remember
that from PTX.  AWE on NT was derived from the same work.  There
should soon be some discussion about it on lse-tech@lists.sourceforge.net
or I can give you some more data...

Worked for Oracle, should be good for large scientific apps, might
work for other piggy server applications as well.

gerrit

In message <OF4EFD903E.F8196584-ON87256BB2.007DEC69@boulder.ibm.com>, "Clifford White" writes:
> 
> We are working with a database that requires a large amount of memory
> allocated by a single process.
> This is on an Intel 32-bit platform.
> We'd like to go > 3GB of memory per process.
> Is this possible on a 32-bit machine? I have been reading the various
> 'highmem' discussions, but that's kernel page tables...
> Or is this a glibc issue, and not proper for a kernel-list question?
> Any pointers would be appreciated. The Intel ESMA (Extended Server Memory
> Arch) page states that it's possible, but.....how?
> 
> cliffw
> NUMA-Q
> Technical Guy
> 1-503-578-4306
> 
> 


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-08  0:16 ` Gerrit Huizenga
@ 2002-05-08  0:56   ` Rik van Riel
  2002-05-08 15:12     ` Martin J. Bligh
  2002-05-09 21:24     ` tchiwam
  0 siblings, 2 replies; 20+ messages in thread
From: Rik van Riel @ 2002-05-08  0:56 UTC (permalink / raw)
  To: Gerrit Huizenga; +Cc: Clifford White, linux-kernel, oliendm

On Tue, 7 May 2002, Gerrit Huizenga wrote:

> Hey Cliff, we are planning to implement virtwin() if you remember that
> from PTX.  AWE on NT was derived from the same work.  There should soon
> be some discussion about it on lse-tech@lists.sourceforge.net or I can
> give you some more data...

Please implement it in userspace, using large POSIX shared memory
segments and mmaping / munmapping them as needed.

This seems like a special enough case to keep it out of the kernel
entirely. If there's something not efficient enough we could work
on optimising the whole mmap & munmap path...

cheers,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-07 23:08 ` Robert Love
@ 2002-05-08  5:33   ` Martin J. Bligh
  2002-05-08  8:29   ` Andrea Arcangeli
  1 sibling, 0 replies; 20+ messages in thread
From: Martin J. Bligh @ 2002-05-08  5:33 UTC (permalink / raw)
  To: Robert Love, Clifford White; +Cc: linux-kernel

> You can go to 3.5GB, anything more and stuff starts getting real tight
> and not very nice.  You can only do 3.5/0.5 on non-PAE, though - PAE
> requires segments to be aligned on 1GB-boundaries.
> 
> The attached patch (for which credit goes elsewhere - Ingo or Randy, I
> think?) implements the full range of 1 to 3.5GB user space partitioning,
> selectable at compile-time.

The trouble with this is that on a machine with enough memory to
make it worthwhile, it normally just runs you out of zone_normal
instead ;-(
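
To put rough numbers on that: the directly mapped low memory (ZONE_DMA plus
ZONE_NORMAL) is what remains of the kernel's share of the address space after
the vmalloc/iomap reserve. The sketch below assumes the stock 2.4 reserve of
128MB and ignores smaller fixed mappings; lowmem_mb() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 128MB reserved for vmalloc() and iomap(), as in stock 2.4. */
#define VMALLOC_RESERVE_MB 128u

/* Approximate directly mapped low memory, in MB, for a given PAGE_OFFSET. */
unsigned lowmem_mb(uint32_t page_offset)
{
    unsigned kernel_mb =
        (unsigned)(((UINT64_C(1) << 32) - page_offset) >> 20);

    assert(kernel_mb > VMALLOC_RESERVE_MB);
    return kernel_mb - VMALLOC_RESERVE_MB;
}
```

The default 3GB/1GB split leaves about 896MB of low memory; the 3.5GB/0.5GB
split leaves about 384MB, which has to hold all the kernel data structures
for however many gigabytes of highmem the box has.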

M.



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-07 23:03 x86 question: Can a process have > 3GB memory? Clifford White
                   ` (2 preceding siblings ...)
  2002-05-08  0:16 ` Gerrit Huizenga
@ 2002-05-08  8:22 ` Luigi Genoni
  3 siblings, 0 replies; 20+ messages in thread
From: Luigi Genoni @ 2002-05-08  8:22 UTC (permalink / raw)
  To: Clifford White; +Cc: linux-kernel


You should be able to give a single process up to 3.6 GB.


On Tue, 7 May 2002, Clifford White wrote:

>
> We are working with a database that requires a large amount of memory
> allocated by a single process.
> This is on an Intel 32-bit platform.
> We'd like to go > 3GB of memory per process.
> Is this possible on a 32-bit machine? I have been reading the various
> 'highmem' discussions, but that's kernel page tables...
> Or is this a glibc issue, and not proper for a kernel-list question?
> Any pointers would be appreciated. The Intel ESMA (Extended Server Memory
> Arch) page states that it's possible, but.....how?
>
> cliffw
> NUMA-Q
> Technical Guy
> 1-503-578-4306
>
>



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-07 23:08 ` Robert Love
  2002-05-08  5:33   ` Martin J. Bligh
@ 2002-05-08  8:29   ` Andrea Arcangeli
  2002-05-08 16:21     ` Robert Love
  1 sibling, 1 reply; 20+ messages in thread
From: Andrea Arcangeli @ 2002-05-08  8:29 UTC (permalink / raw)
  To: Robert Love; +Cc: Clifford White, linux-kernel

On Tue, May 07, 2002 at 04:08:55PM -0700, Robert Love wrote:
> On Tue, 2002-05-07 at 16:03, Clifford White wrote:
> 
> > We are working with a database that requires a large amount of memory
> > allocated by a single process.
> > This is on an Intel 32-bit platform.
> > We'd like to go > 3GB of memory per process.
> > Is this possible on a 32-bit machine? I have been reading the various
> > 'highmem' discussions, but that's kernel page tables...
> > Or is this a glibc issue, and not proper for a kernel-list question?
> > Any pointers would be appreciated. The Intel ESMA (Extended Server Memory
> > Arch) page states that it's possible, but.....how?
> 
> You can go to 3.5GB, anything more and stuff starts getting real tight
> and not very nice.  You can only do 3.5/0.5 on non-PAE, though - PAE
> requires segments to be aligned on 1GB-boundaries.
> 
> The attached patch (for which credit goes elsewhere - Ingo or Randy, I
> think?) implements the full range of 1 to 3.5GB user space partitioning,

Actually, I'm the one who wrote the 3.5G config option for 2.2, and I
recently forward-ported it to 2.4 due to the number of requests I was
getting.

> selectable at compile-time.
> 
> 	Robert Love
> 




Andrea


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-08  0:56   ` Rik van Riel
@ 2002-05-08 15:12     ` Martin J. Bligh
  2002-05-08 15:17       ` Rik van Riel
  2002-05-08 15:24       ` Andi Kleen
  2002-05-09 21:24     ` tchiwam
  1 sibling, 2 replies; 20+ messages in thread
From: Martin J. Bligh @ 2002-05-08 15:12 UTC (permalink / raw)
  To: Rik van Riel, Gerrit Huizenga; +Cc: Clifford White, linux-kernel, oliendm

>> Hey Cliff, we are planning to implement virtwin() if you remember that
>> from PTX.  AWE on NT was derived from the same work.  There should soon
>> be some discussion about it on lse-tech@lists.sourceforge.net or I can
>> give you some more data...
> 
> Please implement it in userspace, using large POSIX shared memory
> segments and mmaping / munmapping them as needed.
> 
> This seems like a special enough case to keep it out of the kernel
> entirely. If there's something not efficient enough we could work
> on optimising the whole mmap & munmap path...

How are you going to change the user page tables from userspace?
This mechanism would seem to need kernel support however you did it.

M.



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-08 15:12     ` Martin J. Bligh
@ 2002-05-08 15:17       ` Rik van Riel
  2002-05-08 15:24       ` Andi Kleen
  1 sibling, 0 replies; 20+ messages in thread
From: Rik van Riel @ 2002-05-08 15:17 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Gerrit Huizenga, Clifford White, linux-kernel, oliendm

On Wed, 8 May 2002, Martin J. Bligh wrote:

> >> Hey Cliff, we are planning to implement virtwin() if you remember that
> >> from PTX.  AWE on NT was derived from the same work.  There should soon
> >> be some discussion about it on lse-tech@lists.sourceforge.net or I can
> >> give you some more data...
> >
> > Please implement it in userspace, using large POSIX shared memory
> > segments and mmaping / munmapping them as needed.
>
> How are you going to change the user page tables from userspace?
> This mechanism would seem to need kernel support however you did it.

mmap(2) and munmap(2)

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-08 15:12     ` Martin J. Bligh
  2002-05-08 15:17       ` Rik van Riel
@ 2002-05-08 15:24       ` Andi Kleen
  1 sibling, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2002-05-08 15:24 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: linux-kernel

"Martin J. Bligh" <Martin.Bligh@us.ibm.com> writes:
> 
> How are you going to change the user page tables from userspace?
> This mechanism would seem to need kernel support however you did it.

You mmap/munmap files in tmpfs. 

That is what tmpfs was developed for by SAP.
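
A minimal sketch of the windowing idea: keep the data in a large file and
slide a fixed-size mapping over it with munmap/mmap. The struct and function
names are invented for illustration, and the backing file would live on a
tmpfs mount in practice. Going past 2GB of backing store on a 32-bit box
additionally needs large-file support (for example building with
_FILE_OFFSET_BITS=64), which is not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct window {
    int    fd;    /* backing file, ideally on tmpfs */
    void  *addr;  /* current mapping, NULL if none */
    size_t len;   /* window size, a multiple of the page size */
};

/* Create (or open) the backing file and size it; nothing is mapped yet. */
int window_open(struct window *w, const char *path, off_t total, size_t len)
{
    w->fd = open(path, O_RDWR | O_CREAT, 0600);
    if (w->fd < 0)
        return -1;
    if (ftruncate(w->fd, total) < 0) {
        close(w->fd);
        return -1;
    }
    w->addr = NULL;
    w->len = len;
    return 0;
}

/* Drop the current window and map the one starting at `offset`. */
void *window_slide(struct window *w, off_t offset)
{
    /* Window-aligned offsets are also page-aligned, as mmap requires. */
    assert(offset % (off_t)w->len == 0);
    if (w->addr != NULL && munmap(w->addr, w->len) < 0)
        return NULL;
    w->addr = mmap(NULL, w->len, PROT_READ | PROT_WRITE, MAP_SHARED,
                   w->fd, offset);
    if (w->addr == MAP_FAILED) {
        w->addr = NULL;
        return NULL;
    }
    return w->addr;
}

/* Unmap the window and close the backing file. */
void window_close(struct window *w)
{
    if (w->addr != NULL)
        munmap(w->addr, w->len);
    close(w->fd);
}
```

Because the mapping is MAP_SHARED, whatever was written through one window is
still there when the same offset is mapped again later.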

-Andi


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-08  8:29   ` Andrea Arcangeli
@ 2002-05-08 16:21     ` Robert Love
  0 siblings, 0 replies; 20+ messages in thread
From: Robert Love @ 2002-05-08 16:21 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Clifford White, linux-kernel

On Wed, 2002-05-08 at 01:29, Andrea Arcangeli wrote:

> On Tue, May 07, 2002 at 04:08:55PM -0700, Robert Love wrote:
>
> > The attached patch (for which credit goes elsewhere - Ingo or Randy, I
> > think?) implements the full range of 1 to 3.5GB user space partitioning,
> 
> Actually, I'm the one who wrote the 3.5G config option for 2.2, and I
> recently forward-ported it to 2.4 due to the number of requests I was
> getting.

Apologies - credit where credit is due.  It clearly came from -aa, what
with the 00_ naming prefix :)

Nice patch.

	Robert Love




* Re: x86 question: Can a process have > 3GB memory?
  2002-05-07 23:33 ` Alan Cox
@ 2002-05-08 16:54   ` Bill Davidsen
  0 siblings, 0 replies; 20+ messages in thread
From: Bill Davidsen @ 2002-05-08 16:54 UTC (permalink / raw)
  To: Alan Cox; +Cc: Clifford White, linux-kernel

On Wed, 8 May 2002, Alan Cox wrote:

> Remember DOS and EMM memory expansion. Basically that is what you come
> down to. Allocate multiple large shared memory segments, attach the one
> you need each time and implement software segment swapping.

  I remember doing bank switching on an IEEE-696 (S-100) system for both
programs and "ramdisk," when 64K (sic) was a lot of memory.
 
> That should have you diving for an AMD hammer or IA64 box as soon as they
> come out 8)

  At the moment Itanium is shipping, it makes up for being expensive by
also being slow ;-) However, Sun will sell you a 64bit UltraSPARC system
with everything but monitor for <$1K. Add a few GB and load Linux. I
*think* you can order with Linux preloaded now, but don't quote me on
that. It's not blindingly fast but it's blindingly cheap.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-08  0:56   ` Rik van Riel
  2002-05-08 15:12     ` Martin J. Bligh
@ 2002-05-09 21:24     ` tchiwam
  2002-05-09 21:40       ` Robert Love
  1 sibling, 1 reply; 20+ messages in thread
From: tchiwam @ 2002-05-09 21:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: Rik van Riel, Gerrit Huizenga, Clifford White, oliendm


> > Hey Cliff, we are planning to implement virtwin() if you remember that
> > from PTX.  AWE on NT was derived from the same work.  There should soon
> > be some discussion about it on lse-tech@lists.sourceforge.net or I can
> > give you some more data...
>
> Please implement it in userspace, using large POSIX shared memory
> segments and mmaping / munmapping them as needed.
>
> This seems like a special enough case to keep it out of the kernel
> entirely. If there's something not efficient enough we could work
> on optimising the whole mmap & munmap path...

How about other architectures, like PowerPC?
The last calculation I did used 11GB of RAM (no swap) on a big Number
Muncher... Would it be nice to use the same code for testing on 32-bit
architectures with swap?

Philippe



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-09 21:24     ` tchiwam
@ 2002-05-09 21:40       ` Robert Love
  2002-05-09 23:56         ` Albert D. Cahalan
  2002-05-10 19:07         ` Bill Davidsen
  0 siblings, 2 replies; 20+ messages in thread
From: Robert Love @ 2002-05-09 21:40 UTC (permalink / raw)
  To: tchiwam
  Cc: linux-kernel, Rik van Riel, Gerrit Huizenga, Clifford White,
	oliendm

On Thu, 2002-05-09 at 14:24, tchiwam wrote:

> How about other architectures, like PowerPC?
> The last calculation I did used 11GB of RAM (no swap) on a big Number
> Muncher... Would it be nice to use the same code for testing on 32-bit
> architectures with swap?

All 32-bit architectures have a 4GB address space, 64-bit architectures
obviously have a much bigger one (depends on the arch how many bits are
used for the address space).

PPC obviously does not have the dumb physical memory limitations x86
has, however.

Anyhow, Rik's mmap trick will work on any arch, not just x86.

	Robert Love



* Re: x86 question: Can a process have > 3GB memory?
  2002-05-09 21:40       ` Robert Love
@ 2002-05-09 23:56         ` Albert D. Cahalan
  2002-05-10  6:58           ` Anton Blanchard
  2002-05-10 19:07         ` Bill Davidsen
  1 sibling, 1 reply; 20+ messages in thread
From: Albert D. Cahalan @ 2002-05-09 23:56 UTC (permalink / raw)
  To: Robert Love
  Cc: tchiwam, linux-kernel, Rik van Riel, Gerrit Huizenga,
	Clifford White, oliendm

Robert Love writes:
> On Thu, 2002-05-09 at 14:24, tchiwam wrote:

>> How about other architectures, like PowerPC?
>> The last calculation I did used 11GB of RAM (no swap) on a big Number
>> Muncher... Would it be nice to use the same code for testing on 32-bit
>> architectures with swap?
>
> All 32-bit architectures have a 4GB address space, 64-bit architectures
> obviously have a much bigger one (depends on the arch how many bits are
> used for the address space).
>
> PPC obviously does not have the dumb physical memory limitations x86
> has, however.

Huh? Unless you mean ppc64, ppc is worse.
On a Mac, you get 2 GB of virtual memory per
process. You get up to 512 MB of physical memory
without highmem support, or usually 4 GB with
highmem support. As with x86, the latest chips
offer a 36-bit (64 GB) physical address space.

Virtual memory layout:

00000000-7fffffff user
80000000-bfffffff waste (for IO on obscure Amiga "upgrade" junk)
c0000000-dfffffff non-paged mapping of 512 MB at phys addr 0
e0000000-ffffffff IO, vmalloc(), etc.

That's not all! Linus recently singled out the PowerPC
MMU for a nice long abusive rant. :-) You get hashed
page tables. You get this:

As with x86, segment registers map a 32-bit virtual
address space onto a larger one. The top 4 bits of
a 32-bit virtual address are used to select a segment,
and the segment provides 24 more address bits to
give you a 52-bit virtual address. Eeeew.
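
As a sketch of that translation step: the top 4 bits of the 32-bit effective
address pick one of 16 segment registers, and the 24-bit segment ID found
there replaces those 4 bits, giving 24 + 28 = 52 bits. The function name and
the array standing in for the segment registers are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * sr_vsid[] stands in for the 16 segment registers; only the low 24 bits
 * (the virtual segment ID) of each entry are used.
 */
uint64_t ppc32_virtual_address(const uint32_t sr_vsid[16], uint32_t ea)
{
    uint32_t seg;
    uint64_t vsid, low;

    assert(sr_vsid != NULL);
    seg  = ea >> 28;                    /* top 4 bits select the segment */
    vsid = sr_vsid[seg] & 0x00ffffffu;  /* 24 bits from the segment */
    low  = ea & 0x0fffffffu;            /* remaining 28 bits of the address */
    return (vsid << 28) | low;          /* 52-bit virtual address */
}
```

The 52-bit result is then what gets looked up in the hashed page table.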


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-09 23:56         ` Albert D. Cahalan
@ 2002-05-10  6:58           ` Anton Blanchard
  0 siblings, 0 replies; 20+ messages in thread
From: Anton Blanchard @ 2002-05-10  6:58 UTC (permalink / raw)
  To: Albert D. Cahalan
  Cc: Robert Love, tchiwam, linux-kernel, Rik van Riel, Gerrit Huizenga,
	Clifford White, oliendm


> Huh? Unless you mean ppc64, ppc is worse.
> On a Mac, you get 2 GB of virtual memory per
> process. You get up to 512 MB of physical memory
> without highmem support, or usually 4 GB with
> highmem support. 

This is fixed in recent kernels, you can specify it with a CONFIG
option:

# uname -a
Linux 2.4.19-pre5 #185 Fri Apr 5 14:36:40 EST 2002 ppc

# cat /proc/self/maps  
0fea5000-0ffbb000 r-xp 00000000 03:0c 163581     /lib/libc-2.2.5.so
0ffbb000-0ffc5000 ---p 00116000 03:0c 163581     /lib/libc-2.2.5.so
0ffc5000-0ffeb000 rw-p 00110000 03:0c 163581     /lib/libc-2.2.5.so
0ffeb000-0fff0000 rw-p 00000000 00:00 0
10000000-10003000 r-xp 00000000 03:0c 1554092    /bin/cat
10012000-10013000 rw-p 00002000 03:0c 1554092    /bin/cat
10013000-10015000 rwxp 00000000 00:00 0
48000000-48014000 r-xp 00000000 03:0c 163537     /lib/ld-2.2.5.so
48014000-48015000 rw-p 00000000 00:00 0
48023000-48027000 rw-p 00013000 03:0c 163537     /lib/ld-2.2.5.so
bfffe000-c0000000 rwxp fffff000 00:00 0

Of course on ppc64 kernels you have a full 4GB of userspace since
the kernel sits at the top of the 64bit address space.

> As with x86, the latest chips
> offer a 36-bit (64 GB) physical address space.

Paulus also has a hack that allows up to 15G of memory on POWER3 class
machines, although that isn't currently in the ppc32 tree.

> That's not all! Linus recently singled out the PowerPC
> MMU for a nice long abusive rant. :-) You get hashed
> page tables. You get this:
> 
> As with x86, segment registers map a 32-bit virtual
> address space onto a larger one. The top 4 bits of
> a 32-bit virtual address are used to select a segment,
> and the segment provides 24 more address bits to
> give you a 52-bit virtual address. Eeeew.

This is all very well and good, but my ppc64 machine still outperforms
anything out there on the kernel compile benchmark :)

Anton


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-09 21:40       ` Robert Love
  2002-05-09 23:56         ` Albert D. Cahalan
@ 2002-05-10 19:07         ` Bill Davidsen
  2002-05-10 19:42           ` Alan Cox
  1 sibling, 1 reply; 20+ messages in thread
From: Bill Davidsen @ 2002-05-10 19:07 UTC (permalink / raw)
  To: Robert Love; +Cc: Linux-Kernel Mailing List

On 9 May 2002, Robert Love wrote:

> All 32-bit architectures have a 4GB address space, 64-bit architectures
> obviously have a much bigger one (depends on the arch how many bits are
> used for the address space).
> 
> PPC obviously does not have the dumb physical memory limitations x86
> has, however.

As others have noted, the recent ia32 chips support 36 (or more) bits of
physical memory, and there is even code to use it AFAIK in the current
kernel. It would be possible to allow program access to this RAM, although
both Kernel and gcc support would be needed. M$ had "huge" memory models
to go over 64k in the old 8086 days, doing loads of segment registers.
 
> Anyhow, Rik's mmap trick will work on any arch, not just x86.

Rik's mmap trick is like the dancing elephant, "the wonder is not that he
does it well but that he does it at all." In the first place most
programmers could not get the code to work reliably in a realistic time
frame (if at all) due to complexity, and if they did the implementation
would not be usefully fast for random access, which is the reason you use
memory in most cases.

Imagine *a++ = *b++ with four system calls per byte. Or imagine an FFT,
where even if you could do range checking to see if mmap() was needed you
would still add multiples to the integer portion, and probably beat the
cache to a pulp.

As a technique for special applications and programmers it works well, but
as a general solution it is totally impractical in both time to code and
time to run. Not to mention portability issues to 64 bit hardware, where
you need still other code unless you want to run many times slower.
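
To make the cost concrete, here is a rough sketch of a windowed-mmap accessor. It is written in Python for brevity, is not Rik's code, and makes several assumptions: a read-only file, a single window, and one unmap plus one map per miss. The class name WindowedFile and the remaps counter are invented for the illustration.

```python
import mmap
import os
import tempfile


class WindowedFile:
    """Read single bytes of a file through one movable mmap window."""

    def __init__(self, path, window_size):
        gran = mmap.ALLOCATIONGRANULARITY
        # mmap offsets must be multiples of the allocation granularity,
        # so round the window size up to a multiple of it.
        self.window_size = -(-window_size // gran) * gran
        self.fd = os.open(path, os.O_RDONLY)
        self.file_size = os.fstat(self.fd).st_size
        self.base = None   # file offset of the current window
        self.view = None   # the current mmap object
        self.remaps = 0    # how many times the window had to move

    def _remap(self, base):
        if self.view is not None:
            self.view.close()                      # unmap the old window
        length = min(self.window_size, self.file_size - base)
        self.view = mmap.mmap(self.fd, length,
                              access=mmap.ACCESS_READ, offset=base)
        self.base = base
        self.remaps += 1

    def __getitem__(self, pos):
        if not 0 <= pos < self.file_size:
            raise IndexError(pos)
        base = pos - pos % self.window_size
        if base != self.base:                      # miss: move the window
            self._remap(base)
        return self.view[pos - base]

    def close(self):
        if self.view is not None:
            self.view.close()
        os.close(self.fd)


# Demo on a scratch file that is four windows long.
gran = mmap.ALLOCATIONGRANULARITY
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(i % 251 for i in range(4 * gran)))

wf = WindowedFile(tmp.name, gran)
sequential = [wf[k * gran] for k in range(4)]      # one access per window
seq_remaps = wf.remaps
for _ in range(10):                                # ping-pong between ends
    wf[0]
    wf[3 * gran]
pingpong_remaps = wf.remaps - seq_remaps
wf.close()
os.unlink(tmp.name)
print(seq_remaps, pingpong_remaps)
```

A sequential walk moves the window once per window-sized chunk, but an access pattern that alternates between two distant offsets moves it on every single access, which is the random-access penalty described above.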

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


* Re: x86 question: Can a process have > 3GB memory?
  2002-05-10 19:42           ` Alan Cox
@ 2002-05-10 19:41             ` Linus Torvalds
  0 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2002-05-10 19:41 UTC (permalink / raw)
  To: linux-kernel

In article <E176GHv-0006ee-00@the-village.bc.nu>,
Alan Cox  <alan@lxorguk.ukuu.org.uk> wrote:
>> kernel. It would be possible to allow program access to this RAM, although
>> both Kernel and gcc support would be needed. M$ had "huge" memory models
>> to go over 64k in the old 8086 days, doing loads of segment registers.
>
>Alas that is not quite the case. You still have a 4GB virtual address
>space. If you want > 32bits, get a > 32bit processor. This one isn't as
>simple as adding segmentation and a 'large model'.

Well, you _could_ use the P bit on the segments and "page" them in on
demand with mmap. That would get you a model very similar to the old
16-bit large model: no single object can be bigger than 2GB, but you can
have a total object size of something like 40 bits.

No kernel support needed, actually. It's all there with the LDT stuff.

But yes, compiler support and a recompiled glibc. And it would break all
programs that assume a flat address space.

And it would really _suck_ performance-wise if your working set is big
enough to cause you to have to switch mmap's a lot.

			Linus

* Re: x86 question: Can a process have > 3GB memory?
  2002-05-10 19:07         ` Bill Davidsen
@ 2002-05-10 19:42           ` Alan Cox
  2002-05-10 19:41             ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: Alan Cox @ 2002-05-10 19:42 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Robert Love, Linux-Kernel Mailing List

> kernel. It would be possible to allow program access to this RAM, although
> both Kernel and gcc support would be needed. M$ had "huge" memory models
> to go over 64k in the old 8086 days, doing loads of segment registers.

Alas that is not quite the case. You still have a 4GB virtual address
space. If you want > 32bits, get a > 32bit processor. This one isn't as
simple as adding segmentation and a 'large model'.

Thread overview: 20+ messages
2002-05-07 23:03 x86 question: Can a process have > 3GB memory? Clifford White
2002-05-07 23:08 ` Robert Love
2002-05-08  5:33   ` Martin J. Bligh
2002-05-08  8:29   ` Andrea Arcangeli
2002-05-08 16:21     ` Robert Love
2002-05-07 23:33 ` Alan Cox
2002-05-08 16:54   ` Bill Davidsen
2002-05-08  0:16 ` Gerrit Huizenga
2002-05-08  0:56   ` Rik van Riel
2002-05-08 15:12     ` Martin J. Bligh
2002-05-08 15:17       ` Rik van Riel
2002-05-08 15:24       ` Andi Kleen
2002-05-09 21:24     ` tchiwam
2002-05-09 21:40       ` Robert Love
2002-05-09 23:56         ` Albert D. Cahalan
2002-05-10  6:58           ` Anton Blanchard
2002-05-10 19:07         ` Bill Davidsen
2002-05-10 19:42           ` Alan Cox
2002-05-10 19:41             ` Linus Torvalds
2002-05-08  8:22 ` Luigi Genoni
