From: "David S. Miller" <davem@redhat.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: wli@holomorphy.com, linux-arch@vger.kernel.org
Subject: Re: copy_page_range()
Date: Wed, 11 Aug 2004 13:45:07 -0700
Message-ID: <20040811134507.714f2a38.davem@redhat.com>
In-Reply-To: <Pine.LNX.4.58.0408110856080.1839@ppc970.osdl.org>
On Wed, 11 Aug 2004 09:13:36 -0700 (PDT)
Linus Torvalds <torvalds@osdl.org> wrote:
> Hmm.. I don't see any of this being arch-dependent, so I wonder why you
> did it that way.
I'm trying to achieve two goals. The first I've demonstrated is
achievable; the second I haven't fully worked out yet.
Firstly, I wanted to get clear_page_tables() out of my profiles.
Secondly, I wanted to completely abstract out the page table
traversal that the generic kernel code does.
I want the latter so I can experiment with different data structures
for page tables, and the current pgd/pmd/pte array assumptions in
the generic kernel vm code rule out any kind of tinkering in that
area.
If we end up with an interface that says: "walk page tables for vaddr
range 'start' to 'end', and do func() for each pte" then anything can
be experimented with.
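A minimal user-space sketch of what such an interface could look like,
under stated assumptions: every name here (toy_walk_range, toy_pte_fn,
toy_map, and so on) is hypothetical, the geometry (4 KB pages, 16
entries per table) is a toy and not sparc64's, and none of this is
kernel code. The point it illustrates is that callers only see a
range and a callback, so the three-level array behind toy_walk_range()
could be swapped for another structure without touching them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy geometry, NOT sparc64's: 4 KB pages, 16 entries per table, 3 levels. */
#define TOY_PAGE_SHIFT 12
#define TOY_BITS       4
#define TOY_ENTRIES    (1u << TOY_BITS)
#define TOY_PAGE_SIZE  (1ull << TOY_PAGE_SHIFT)
#define TOY_VA_END     (1ull << (TOY_PAGE_SHIFT + 3 * TOY_BITS))

typedef uint64_t toy_pte_t;                 /* 0 means "not present" */
struct toy_pte_page { toy_pte_t pte[TOY_ENTRIES]; };
struct toy_pmd { struct toy_pte_page *pte_page[TOY_ENTRIES]; };
struct toy_pgd { struct toy_pmd *pmd[TOY_ENTRIES]; };

/* Callback type: the only thing callers of the walker need to know about. */
typedef void (*toy_pte_fn)(uint64_t vaddr, toy_pte_t *pte, void *arg);

/* Index of vaddr within the table at 'level' (0 = pte, 1 = pmd, 2 = pgd). */
static unsigned toy_idx(uint64_t vaddr, int level)
{
    return (unsigned)(vaddr >> (TOY_PAGE_SHIFT + level * TOY_BITS)) & (TOY_ENTRIES - 1);
}

/* First address past the region covered by vaddr's entry at 'level'. */
static uint64_t toy_next(uint64_t vaddr, int level)
{
    uint64_t span = 1ull << (TOY_PAGE_SHIFT + level * TOY_BITS);
    return (vaddr | (span - 1)) + 1;
}

/* Install a pte for vaddr, allocating intermediate tables on demand. */
int toy_map(struct toy_pgd *pgd, uint64_t vaddr, toy_pte_t val)
{
    assert(vaddr < TOY_VA_END);
    struct toy_pmd **pmdp = &pgd->pmd[toy_idx(vaddr, 2)];
    if (!*pmdp && !(*pmdp = calloc(1, sizeof(**pmdp))))
        return -1;
    struct toy_pte_page **ptp = &(*pmdp)->pte_page[toy_idx(vaddr, 1)];
    if (!*ptp && !(*ptp = calloc(1, sizeof(**ptp))))
        return -1;
    (*ptp)->pte[toy_idx(vaddr, 0)] = val;
    return 0;
}

/* Walk [start, end) and call func() for each present pte. */
void toy_walk_range(struct toy_pgd *pgd, uint64_t start, uint64_t end,
                    toy_pte_fn func, void *arg)
{
    assert(end <= TOY_VA_END);
    uint64_t va = start & ~(TOY_PAGE_SIZE - 1);
    while (va < end) {
        struct toy_pmd *pmd = pgd->pmd[toy_idx(va, 2)];
        if (!pmd) { va = toy_next(va, 2); continue; }   /* skip empty pgd slot */
        struct toy_pte_page *pt = pmd->pte_page[toy_idx(va, 1)];
        if (!pt) { va = toy_next(va, 1); continue; }    /* skip empty pmd slot */
        toy_pte_t *pte = &pt->pte[toy_idx(va, 0)];
        if (*pte)
            func(va, pte, arg);
        va += TOY_PAGE_SIZE;
    }
}

/* Two example callbacks: count present ptes, or clear them. */
void count_pte(uint64_t vaddr, toy_pte_t *pte, void *arg)
{
    (void)vaddr; (void)pte;
    ++*(size_t *)arg;
}

void clear_pte(uint64_t vaddr, toy_pte_t *pte, void *arg)
{
    (void)vaddr; (void)arg;
    *pte = 0;
}

/* Map a few pages in a "text", "mmap" and "stack" region, then count them. */
size_t toy_demo(void)
{
    static struct toy_pgd pgd;              /* zero-initialized */
    size_t n = 0;
    toy_map(&pgd, 0x001000, 1);
    toy_map(&pgd, 0x002000, 1);
    toy_map(&pgd, 0x700000, 1);
    toy_map(&pgd, 0xfff000, 1);
    toy_walk_range(&pgd, 0, TOY_VA_END, count_pte, &n);
    return n;
}
```

Calling toy_demo() from a main() returns 4, one per mapped page.
The same walker serves a "clear" pass by handing it clear_pte instead
of count_pte; neither callback knows how the tables are laid out.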
You're absolutely right, and I've mentioned this earlier in this thread,
that the current page tables are way too sparse. On 64-bit a simple
hello world program with a 3-level page table looks roughly like:
PGD_BASE:
        ...
        X --> PMD_BASE1
                ...
                Y --> PTE_BASE1
                        ... some ptes ...
                ...
        Z --> PMD_BASE2
                ...
                A --> PTE_BASE2
                        ... some ptes ...
                ...
        B --> PMD_BASE3
                ...
                C --> PTE_BASE3
                        ... some ptes ...
                ...
        ...
The X-->Y branch is for the program text.
The Z-->A branch is for the dynamic mmap() area (shared libraries,
anonymous mmaps, etc.)
The B-->C branch is for the program stack.
We've got maybe 10 to 20 present pte's in this tree.
On sparc64 pgd_t and pmd_t are both 32-bit. (This is in order to
encode the most address space possible; we can encode the full
physical address by simply shifting out the page offset bits.)
So each 8192-byte pgd_t table holds 2048 entries, as does each
pmd_t table.
Therefore, in the above example during clear_page_tables() we'd
scan 2048 pgd's, 3 * 2048 pmd's and 3 * 1024 pte's.
That's 7 * 8192 (PAGE_SIZE) bytes' worth of pointer dereferencing.
It's no wonder this shows up in the profiles. All of that just
for 10 to 20 actual user mappings. This is broken.
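The arithmetic above can be checked with a short sketch. The page size
(8192) and the 32-bit pgd/pmd entries come from the text; the 8-byte
pte entry is an assumption inferred from the figure of 1024 ptes per
page. The names (full_scan_cost, struct scan_cost) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

enum {
    PAGE_BYTES      = 8192, /* PAGE_SIZE, from the text */
    PGD_ENTRY_BYTES = 4,    /* 32-bit pgd_t, from the text */
    PMD_ENTRY_BYTES = 4,    /* 32-bit pmd_t, from the text */
    PTE_ENTRY_BYTES = 8     /* assumption: inferred from 1024 ptes per page */
};

struct scan_cost {
    unsigned long entries;  /* table slots examined */
    unsigned long pages;    /* table pages touched */
    unsigned long bytes;    /* pages * PAGE_BYTES */
};

/* Cost of scanning one pgd table plus every slot of the given
 * number of pmd tables and pte tables, one page per table. */
struct scan_cost full_scan_cost(unsigned long pmd_tables, unsigned long pte_tables)
{
    struct scan_cost c;
    unsigned long pgd_entries = PAGE_BYTES / PGD_ENTRY_BYTES;   /* 2048 */
    unsigned long pmd_entries = PAGE_BYTES / PMD_ENTRY_BYTES;   /* 2048 */
    unsigned long pte_entries = PAGE_BYTES / PTE_ENTRY_BYTES;   /* 1024 */

    c.entries = pgd_entries + pmd_tables * pmd_entries + pte_tables * pte_entries;
    c.pages = 1 + pmd_tables + pte_tables;
    c.bytes = c.pages * PAGE_BYTES;
    return c;
}

/* Print the cost for the three-branch (text, mmap, stack) example. */
void print_example_cost(void)
{
    struct scan_cost c = full_scan_cost(3, 3);
    printf("%lu entries in %lu pages (%lu bytes)\n", c.entries, c.pages, c.bytes);
}
```

For the three-branch example, full_scan_cost(3, 3) gives 11264 entries
across 7 pages, or 57344 bytes, to find the 10 to 20 ptes that are
actually present.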
I want to try and use a less sparse data structure on sparc
just for the pgd/pmd level, and use pages of ptes for the pte_t
level as those tend to be well populated. I also need to retain
the pte_t level as a full page due to the virtual linear page table
stuff I do to speed up TLB miss processing (roughly the same as
what ia64 does).
I can't experiment while all the generic code assumes these things
are arrays.