public inbox for linux-kernel@vger.kernel.org
From: torvalds@transmeta.com (Linus Torvalds)
To: linux-kernel@vger.kernel.org
Subject: Re: [Lse-tech] Re: 10.31 second kernel compile
Date: Fri, 15 Mar 2002 18:20:17 +0000 (UTC)
Message-ID: <a6te11$si7$1@penguin.transmeta.com>
In-Reply-To: <20020313085217.GA11658@krispykreme> <20020314112725.GA2008@krispykreme> <87wuwfxp25.fsf@fadata.bg> <E16la2m-0000SX-00@starship>

In article <E16la2m-0000SX-00@starship>,
Daniel Phillips  <phillips@bonn-fries.net> wrote:
>On March 14, 2002 02:21 pm, Momchil Velikov wrote:
>> 
>> Out of curiosity, why is there a need to update the Linux page tables?
>> Doesn't pte/pmd/pgd family functions provide enough abstraction in
>> order to maintain _only_ the hashed page table ?
>
>No, it's hardwired to the x86 tree view of page translation.

No no no.

If you think that, then you don't see the big picture.

In fact, when I did the 3-level page tables for Linux, no x86 chips that
could _use_ three levels actually existed.

The Linux MM was actually _designed_ for portability when I did the port
to alpha (oh, that's a long time ago). I even wrote my master's thesis on
why it was done the way it was done (the only actual academic use I ever
got out of the whole Linux exercise ;)

Yes, a tree-based page table matches a lot of hardware architectures very
well.  And it's _not_ just x86: it also matches soft-fill TLBs better
than alternatives (newer sparcs and MIPS), and matches a number of other
architecture specifications (e.g. alpha, m68k).

So on about 50% of architectures (and 99.9% of machines), the Linux MM
data structures can be made to map 1:1 to the hardware constructs, so
that you avoid duplicate information. 

But more importantly than that, the whole point really is that the page
table tree as far as Linux is concerned is nothing but an _abstraction_
of the VM mapping hardware. It so happens that a tree format is the only
sane format to keep full VM information that works well with real loads.
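To make the "abstraction" point concrete, here is a toy model of a
three-level page table tree. It is an illustrative sketch only, not kernel
code: the 2/9/9/12 bit split, the nested-dict representation, and the names
`split`, `map_page` and `translate` are all assumptions made for this
example. The point it shows is that the tree is sparse (unmapped regions
cost nothing) yet can describe any mapping in the address space.

```python
# Toy model of a three-level page table tree (illustrative, not kernel code).
# Assumed layout: 32-bit virtual address, 4 KiB pages, index widths 2/9/9 bits.
PAGE_SHIFT = 12
PTE_BITS = 9
PMD_BITS = 9
PGD_BITS = 2


def split(vaddr):
    """Return the (pgd, pmd, pte, offset) indexes of a virtual address."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    pte = (vaddr >> PAGE_SHIFT) & ((1 << PTE_BITS) - 1)
    pmd = (vaddr >> (PAGE_SHIFT + PTE_BITS)) & ((1 << PMD_BITS) - 1)
    pgd = (vaddr >> (PAGE_SHIFT + PTE_BITS + PMD_BITS)) & ((1 << PGD_BITS) - 1)
    return pgd, pmd, pte, offset


def map_page(tree, vaddr, frame):
    """Record that the page containing vaddr maps to physical frame `frame`."""
    pgd, pmd, pte, _ = split(vaddr)
    # Intermediate levels are created only when first needed (sparse tree).
    tree.setdefault(pgd, {}).setdefault(pmd, {})[pte] = frame


def translate(tree, vaddr):
    """Walk the tree; return the physical address, or None if unmapped."""
    pgd, pmd, pte, offset = split(vaddr)
    try:
        frame = tree[pgd][pmd][pte]
    except KeyError:
        return None
    return (frame << PAGE_SHIFT) | offset
```

With an empty `tree = {}`, calling `map_page(tree, 0x08048000, 0x1234)`
creates exactly one entry per level, and `translate` then resolves any
address inside that page while still returning `None` for everything else.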

Whatever the hardware actually does, Linux considers that to be nothing
but an extended TLB.  When you can make the MM software tree map 1:1
with the extended TLB (as on x86), you win in memory usage and in
cheaper TLB invalidates, but you _could_ (if you wanted to) just keep
two separate trees.  In fact, with the rmap patches, that's exactly what
you see: the software tree is _not_ 1:1 with the hardware tree any more
(but it _is_ a proper superset, so that you can still get partial
sharing and still get the cheaper TLB updates). 

Are there machines where the sharing between the software abstraction
and the hardware isn't as total? Sure. But if you actually know how
hashed page tables work on ppc, you'd be aware of the fact that they
aren't actually able to do a full VM mapping - when a hash chain gets too
long, the hardware is no longer able to look it up ("too long" being 16
entries on a PPC, for example).

And that's a common situation with non-tree VM representations - they
aren't actually VM representations, they are just caches of what the
_real_ representation is.  And what do we call such caches? Right: they
are nothing but a TLB. 
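The "cache of the real representation" argument can be sketched as a
bounded hash structure that is refilled from a complete map on a miss. This
is a hypothetical model, not the actual PPC hashed page table format: the
class `HashedCache`, the eviction policy (oldest entry first), and the plain
dict standing in for the full software mapping are all assumptions chosen to
keep the example short. The chain limit of 16 is the figure quoted above.

```python
class HashedCache:
    """Bounded hash of translations (hypothetical model, not the PPC layout)."""

    def __init__(self, buckets=4, chain_limit=16):
        self.buckets = [dict() for _ in range(buckets)]
        self.chain_limit = chain_limit

    def _bucket(self, vpn):
        # Trivial hash function: virtual page number modulo bucket count.
        return self.buckets[vpn % len(self.buckets)]

    def insert(self, vpn, frame):
        bucket = self._bucket(vpn)
        if vpn not in bucket and len(bucket) >= self.chain_limit:
            # Chain is full: drop the oldest entry (dicts keep insertion
            # order). The translation is lost from the cache, not from the
            # system, because the full map still holds it.
            del bucket[next(iter(bucket))]
        bucket[vpn] = frame

    def lookup(self, vpn):
        return self._bucket(vpn).get(vpn)


def translate(cache, full_map, vpn):
    """Try the bounded hash first; on a miss, refill it from the full map."""
    frame = cache.lookup(vpn)
    if frame is None and vpn in full_map:
        frame = full_map[vpn]
        cache.insert(vpn, frame)
    return frame
```

Because entries can be evicted at any time, the hash alone can never answer
"is this page mapped?"; only `full_map` can, which is what makes the hash
behave like a large TLB rather than a VM representation.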

So the fact is, the Linux tree-based VM has _nothing_ to do with x86
tree-basedness, and everything to do with the fact that it's the only
sane way to keep VM information. 

The fact that it maps 1:1 to the x86 trees with the "folding" of the mid
layer was a design consideration, for sure.  Being efficient and clever
is always good.  But the basic reason for tree-ness lies elsewhere. 
(The basic reasons for tree-ness are why so many architectures _do_ use a
tree-based page table - you should think of PPC and ia64 as the sick
puppies who didn't understand.  Read the PPC documentation on virtual
memory, and you'll see just _how_ sick they are). 
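The "folding" of the mid layer mentioned above can be illustrated with a
small sketch. It assumes the classic two-level x86 split of 10/10/12 bits
and an arbitrary 2/9/9/12 split for a true three-level layout; the helper
name `make_split` is hypothetical. Giving the middle level zero index bits
makes its index always 0, so generic three-level walking code runs
unchanged on two-level hardware.

```python
def make_split(pgd_bits, pmd_bits, pte_bits, page_shift=12):
    """Build a function that splits a virtual address into level indexes."""

    def split(vaddr):
        pte = (vaddr >> page_shift) & ((1 << pte_bits) - 1)
        pmd = (vaddr >> (page_shift + pte_bits)) & ((1 << pmd_bits) - 1)
        pgd = (vaddr >> (page_shift + pte_bits + pmd_bits)) & ((1 << pgd_bits) - 1)
        return pgd, pmd, pte

    return split


# A genuine three-level layout (assumed 2/9/9 split for illustration).
three_level = make_split(pgd_bits=2, pmd_bits=9, pte_bits=9)

# Two-level x86 layout: the mid level has zero bits, so its mask is 0 and
# the pmd index is always 0. The middle level is "folded" away.
folded = make_split(pgd_bits=10, pmd_bits=0, pte_bits=10)
```

For example, `folded(0xC0123456)` yields a top-level index of `0x300`, a
mid-level index of `0`, and a bottom-level index of `0x123`, which is
exactly the two-level x86 decomposition of that address.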

			Linus

