public inbox for linux-kernel@vger.kernel.org
* The 3G (or nG) Kernel Memory Space Offset
@ 2006-08-29 14:15 Dong Feng
  2006-08-29 14:32 ` Andi Kleen
  2006-08-29 14:36 ` Jan Engelhardt
  0 siblings, 2 replies; 14+ messages in thread
From: Dong Feng @ 2006-08-29 14:15 UTC (permalink / raw)
  To: Andi Kleen, Nick Piggin, Arjan van de Ven, Dong Feng,
	Paul Mackerras, Christoph Lameter, David Howells
  Cc: linux-kernel

The Linux kernel permanently maps the 3-4G linear address range to the
0-1G physical address range. My question is: what is the rationale
behind this counterintuitive mapping? Is it just a personal
choice of the early kernel developers?

* Re: The 3G (or nG) Kernel Memory Space Offset
@ 2006-08-30  4:32 linux
  0 siblings, 0 replies; 14+ messages in thread
From: linux @ 2006-08-30  4:32 UTC (permalink / raw)
  To: middle.fengdong; +Cc: linux-kernel

Just to answer the question in elementary terms:

This is because:
- On x86, the user and kernel share the available 4G virtual address space.
- User space gets first choice, and so takes the low 3G.
- The kernel thus has to use the high 1G, and if it wants a copy
  of physical memory, that's the only place it can go.

In somewhat more detail:

1) In standard x86 Linux, the user and kernel address spaces share the 4
   GB virtual address space of the x86 processor.  There are other ways
   to do it (see the 4G+4G patch for an example), but they're slower.

   x86 processors only support one set of page tables at a time, and
   switching to another set is slow, because it flushes the TLB.  Other
   processors let you have separate user and kernel page tables active
   simultaneously, but x86 does not.

   So for speed, you don't want to change page tables to make a system
   call.  Also, many system calls are passed pointers to buffers in user
   memory, so need to access user memory.  It's fastest and easiest to do
   this if user memory is in the address space when executing kernel code.

   Fortunately, each x86 page table entry has a "user" bit.  When that
   bit is clear, the page is accessible only from kernel mode.  Such pages
   are still in the user's virtual address space, but user code can't
   access them.  Thus, it is possible for the user and kernel to share
   the address space.

   So, given all of this, Linux (as well as most other operating systems)
   on x86 has decided to divide the 4 GB virtual address space into "user"
   and "kernel" parts.  As far as the user is concerned, the kernel part
   is just "missing", so it's made as small as reasonably possible.
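
To make the "user" bit from point 1 concrete, here is a minimal sketch in
plain C.  The bit positions are those of a 32-bit non-PAE x86 page table
entry; the macro names and the helper pte_user_accessible() are made up for
illustration and are not the kernel's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low flag bits of a 32-bit (non-PAE) x86 page table entry. */
#define PTE_PRESENT  (1u << 0)  /* page is mapped */
#define PTE_WRITABLE (1u << 1)  /* writes are allowed */
#define PTE_USER     (1u << 2)  /* set: user mode may access; clear: kernel only */

/* Hypothetical helper: would a user-mode access to this page be allowed? */
static bool pte_user_accessible(uint32_t pte)
{
    return (pte & PTE_PRESENT) && (pte & PTE_USER);
}
```

A kernel page has PTE_PRESENT set and PTE_USER clear, so it is mapped in
every process but this check fails for it.  That is what lets one set of
page tables serve both user and kernel code.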

2) The division chosen is that the user gets the low 3G of the address
   space, and the kernel gets the high 1G.  x86 ABI standards require
   that user space gets low addresses, and in any case, the kernel exists
   to make user-space programs happy.
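
As a rough illustration of the split, the sketch below classifies a 32-bit
virtual address and checks that a buffer passed to a system call lies
entirely in the user part.  It assumes the default 3G/1G division;
KERNEL_BASE, is_user_address() and user_range_ok() are hypothetical names,
not kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed split: user space below 3 GB, kernel in the top 1 GB. */
#define KERNEL_BASE 0xC0000000u

/* True if the address belongs to the user part of the address space. */
static bool is_user_address(uint32_t vaddr)
{
    return vaddr < KERNEL_BASE;
}

/* True if the buffer [vaddr, vaddr + len) lies entirely below KERNEL_BASE.
 * Written as a subtraction so that vaddr + len cannot overflow. */
static bool user_range_ok(uint32_t vaddr, uint32_t len)
{
    return vaddr <= KERNEL_BASE && len <= KERNEL_BASE - vaddr;
}
```

Because the boundary is a single constant, a check like this costs one or
two comparisons, which is part of why the shared layout is cheap.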

3) The kernel finds it convenient to have a copy of physical memory in its
   address space, so it maps one.  If there's more RAM than will fit in the
   kernel address space, the HIGHMEM patches provide an alternative.
   Since this is an elementary explanation, I won't describe how that works.

Thus, the physical memory map used in the kernel ends up offset by 3G.
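
The offset itself is just an addition.  The sketch below assumes the default
3G base; PAGE_OFFSET_3G and the two function names are hypothetical.  (The
kernel has its own __pa() and __va() macros for this purpose; what follows
is a user-space illustration of the arithmetic, not their definition.)

```c
#include <stdint.h>

/* Assumed base of the kernel's copy of physical memory (3 GB). */
#define PAGE_OFFSET_3G 0xC0000000u

/* Physical address -> kernel virtual address in the direct mapping. */
static uint32_t phys_to_kernel_virt(uint32_t phys)
{
    return phys + PAGE_OFFSET_3G;
}

/* Kernel virtual address in the direct mapping -> physical address. */
static uint32_t kernel_virt_to_phys(uint32_t virt)
{
    return virt - PAGE_OFFSET_3G;
}
```

For example, physical address 0x00100000 (1 MB) appears in the kernel at
virtual address 0xC0100000.  The translation only makes sense for physical
addresses below 1 GB; anything above that does not fit in the kernel's part
of the address space.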


end of thread, other threads:[~2006-08-30  6:12 UTC | newest]

Thread overview: 14+ messages
2006-08-29 14:15 The 3G (or nG) Kernel Memory Space Offset Dong Feng
2006-08-29 14:32 ` Andi Kleen
2006-08-29 14:36 ` Jan Engelhardt
2006-08-29 16:01   ` Dong Feng
2006-08-29 16:05     ` Arjan van de Ven
2006-08-29 16:30       ` Jan Engelhardt
2006-08-29 16:44         ` Jeremy Fitzhardinge
2006-08-29 16:12     ` Christoph Lameter
2006-08-29 16:16       ` Dong Feng
2006-08-29 16:42     ` Jeremy Fitzhardinge
2006-08-29 18:37       ` Peter Grandi
2006-08-29 21:15         ` Jeremy Fitzhardinge
2006-08-30  6:11           ` Jan Engelhardt
  -- strict thread matches above, loose matches on Subject: below --
2006-08-30  4:32 linux
