From: Andrea Arcangeli <andrea@suse.de>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: "Martin J. Bligh" <Martin.Bligh@us.ibm.com>,
William Lee Irwin III <wli@holomorphy.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
"M. Edward Borasky" <znmeb@aracnet.com>,
linux-kernel@vger.kernel.org, riel@surriel.com, akpm@zip.com.au
Subject: Re: Have the 2.4 kernel memory management problems on large machines been fixed?
Date: Thu, 23 May 2002 00:35:12 +0200
Message-ID: <20020522223512.GN21164@dualathlon.random>
In-Reply-To: <384590000.1022102334@flay> <Pine.LNX.4.33.0205221421180.1531-100000@penguin.transmeta.com>
On Wed, May 22, 2002 at 02:23:39PM -0700, Linus Torvalds wrote:
>
> On Wed, 22 May 2002, Martin J. Bligh wrote:
> >
> > If we could get the apps (well, Oracle) to co-operate, we could just use
> > clone ;-) Having this transparent for shmem segments would be really nice.
>
> The thing is, we won't get Oracle to rewrite a lot for a completely
> threaded system. And clone does _not_ come with a way to share only parts
Actually, not using threads also provides a bit more protection across
the different "threads", but OTOH the shm part could be corrupted
anyways if there's a bug.
> of the VM, and never will - that's fundamentally against the way "struct
> mm_struct" works.
>
> Oracle is apparently already used to magic shmem-like things, so doing
> that is probably acceptable to them.
For x86, using largepages is the first prio; the relevance of sharing
pagetables is next to nothing compared to 4M pages. As HPA said at the
last kernel summit during the commentary on either the VM or the Oracle
speech (and I'm not sure if everybody understood what he said), without
PAE, 4M pages already give you the effect of shared pagetables, because
there's nothing left to share. And no, the pgd cannot be shared, because
the fact that we are not using clone() is our problem in the first place.
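To put a rough number on "nothing left to share", here is a minimal sketch of the pagetable arithmetic for non-PAE x86. It assumes the standard two-level layout (4K base pages, 4-byte entries, 1024 entries per table) and borrows the 4000 tasks and the 1.5G shm segment used as the example in this mail; the function and constant names are purely illustrative.

```python
# Assumed non-PAE x86 layout: 4K base pages, 1024 four-byte entries per table.
PAGE_SIZE = 4 * 1024
ENTRIES_PER_TABLE = 1024
PTE_PAGE_COVERAGE = PAGE_SIZE * ENTRIES_PER_TABLE  # one pte page maps 4M


def pte_page_bytes(mapped_bytes, large_pages):
    """Pagetable memory below the pgd that one task needs to map mapped_bytes."""
    if large_pages:
        # A 4M page is mapped directly by a pgd entry, so no pte page exists.
        return 0
    pte_pages = -(-mapped_bytes // PTE_PAGE_COVERAGE)  # ceiling division
    return pte_pages * PAGE_SIZE


SHM = 1536 * 1024 * 1024  # the 1.5G shm segment from the example
TASKS = 4000

per_task = pte_page_bytes(SHM, large_pages=False)
print(per_task // 1024, "KiB of pte pages per task with 4K pages")
print(per_task * TASKS // (1024 * 1024), "MiB of pte pages over", TASKS, "tasks")
print(pte_page_bytes(SHM, large_pages=True), "bytes of pte pages with 4M pages")
```

Under these assumptions the 4K-page case costs 1536 KiB of pte pages per task, about 6000 MiB over 4000 tasks, while the 4M-page case needs no pte pages at all.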
With PAE there is the pmd to share, but if you actually do the math, the
pmd for 3G worth of address space is 12k per task, so it doesn't really
matter at all. With 4000 tasks all mapping the same 1.5G shm segment,
the amount of sharable pmd memory comes to only 24Mbytes. Who cares
about 24Mbytes with 4000 tasks working on 1.5G of ram each (in
particular with much more powerful tlb caching for those virtual
addresses)? At the very least it is a very secondary interest, and it
cannot make any difference at all without PAE.
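The math above can be checked with a short sketch. It assumes the usual PAE layout, where each pmd entry is 8 bytes and maps 2M, so one 4K pmd page of 512 entries covers 1G; the names are illustrative only.

```python
# Assumed PAE x86 layout: 8-byte pmd entries, each one mapping a 2M range.
PMD_ENTRY_SIZE = 8
PMD_ENTRY_COVERAGE = 2 * 1024 * 1024
GIB = 1024 ** 3


def pmd_bytes(mapped_bytes):
    """Bytes of pmd entries one task needs to cover mapped_bytes."""
    entries = -(-mapped_bytes // PMD_ENTRY_COVERAGE)  # ceiling division
    return entries * PMD_ENTRY_SIZE


TASKS = 4000

whole_address_space = pmd_bytes(3 * GIB)  # the full 3G of user address space
shm_segment = pmd_bytes(3 * GIB // 2)     # the 1.5G shm segment

print(whole_address_space // 1024, "KiB of pmd per task for 3G")
print(shm_segment * TASKS, "bytes of sharable pmd over", TASKS, "tasks")
```

Under these assumptions the pmd for 3G is 12 KiB per task, and the part covering the 1.5G segment adds up to 24,576,000 bytes over 4000 tasks, which matches the roughly 24Mbytes quoted above.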
On x86-64 as well we use a single top level pte for all the tasks (only
the first top level entry changes in the mm switches), so again using 2M
pages would be more than enough there too, just as in x86 PAE (as if
x86-64 had 3-level pagetables like PAE x86). Of course there we can go
way above 1.5G of shm mapped per task, so at some point, as the shm size
increases, the pmd sharing may become more relevant, but I don't see it
as a short term issue at least.
As a last thing, we'll also need to enforce a large enough MAX_ORDER to
be able to allocate 2M/4M pages. I had hoped we could turn it down to
save cpu cycles as soon as all the hashtable allocations are finally
rewritten to use the bootmem allocator, and that won't be possible
anymore, but it's not a showstopper; desktops are idle all the time
anyways.
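As a rough illustration of the MAX_ORDER constraint, the sketch below computes the buddy order that a 2M or 4M page corresponds to with 4K base pages. It assumes that the largest order the allocator can hand out is MAX_ORDER - 1; that convention and the helper names are assumptions made for this example, not something stated in this mail.

```python
BASE_PAGE = 4 * 1024  # assumed 4K base page size


def order_for(size_bytes):
    """Smallest order such that 2**order base pages hold size_bytes."""
    pages = -(-size_bytes // BASE_PAGE)  # ceiling division
    return (pages - 1).bit_length()


def min_max_order(size_bytes):
    """Smallest MAX_ORDER allowing the allocation (assumes max order is MAX_ORDER - 1)."""
    return order_for(size_bytes) + 1


for label, size in (("2M", 2 * 1024 * 1024), ("4M", 4 * 1024 * 1024)):
    print(label, "page: order", order_for(size), "needs MAX_ORDER >=", min_max_order(size))
```

Under that assumption a 2M page is an order-9 allocation and a 4M page is an order-10 allocation, so MAX_ORDER has to stay at 10 or 11 respectively.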
Andrea