From: "Martin J. Bligh" <mbligh@aracnet.com>
To: Andrea Arcangeli <andrea@suse.de>, Andrew Morton <akpm@osdl.org>
Cc: Rik van Riel <riel@redhat.com>, linux-kernel@vger.kernel.org
Subject: Re: 2.4.23aa2 (bugfixes and important VM improvements for the high end)
Date: Fri, 27 Feb 2004 14:03:07 -0800
Message-ID: <162060000.1077919387@flay>
In-Reply-To: <20040227211548.GI8834@dualathlon.random>

> note that the 4:4 split is wrong in 99% of cases where people need 64G.
> I'm advocating strongly for the 2:2 split to everybody I talk
> with, I'm trying to spread the 2:2 idea because IMHO it's an order of
> magnitude simpler and an order of magnitude superior. Unfortunately I
> could not get a single number to back my 2:2 claims, since the 4:4
> buzzword is spreading and people only test with 4:4. So it's pretty hard
> for me to spread the 2:2 buzzword.

For the record, I for one am not opposed to doing 2:2 instead of 4:4.
What pisses me off is people trying to squeeze large amounts of memory
into 3:1, and distros pretending it's supportable, when it's never 
stable across a broad spectrum of workloads. Between 2:2 and 4:4,
it's just a different overhead tradeoff.

> 4:4 makes no sense at all, the only advantage of 4:4 w.r.t. 2:2 is that
> they can map 2.7G per task of shm instead of 1.7G per task of shm.

Eh? You have a 2GB difference of user address space, and a 1GB difference
of shm size. You lost a GB somewhere ;-) Depending on whether you move
TASK_UNMAPPED_BASE or not, you might mean 2.7 vs 0.7 or at a pinch
3.5 vs 1.5, I'm not sure.
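
To spell out the arithmetic, here is a rough sketch. It assumes roughly 4G
of user space for 4:4 and 2G for 2:2, and it treats the usable shm window
as everything between TASK_UNMAPPED_BASE and the top of user space. The two
base values (1.3G and 0.5G) are not taken from any kernel config; they are
picked only because they reproduce the pairs above. The helper name
mmap_window_gb is made up for illustration.

```python
def mmap_window_gb(user_gb, unmapped_base_gb):
    """Address space from TASK_UNMAPPED_BASE to the top of user space, in GB."""
    return user_gb - unmapped_base_gb


if __name__ == "__main__":
    # Base values are assumptions chosen to match the figures in the mail.
    for base in (1.3, 0.5):
        w44 = mmap_window_gb(4.0, base)  # 4:4 split, roughly 4G of user space
        w22 = mmap_window_gb(2.0, base)  # 2:2 split, 2G of user space
        print(f"base {base:.1f}G: 4:4 -> {w44:.1f}G, 2:2 -> {w22:.1f}G, "
              f"gap {w44 - w22:.1f}G")
```

Whatever the base is, as long as it is the same in both configurations the
gap stays at 2G, which is why a 1G difference looks off.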

> syscall and irq. I expect the databases will run an order of magnitude
> faster with _2:2_ in a 64G configuration, with _1.7G_ per process of shm
> mapped, instead of their 4:4 split with 2.7G (or more, up to 3.9 ;)
> mapped per task.

That may well be true for some workloads, I suspect it's slower for others.
One could call the tradeoff either way.
 
> I don't mind if 4:4 gets merged but I recommend db vendors to benchmark
> _2:2_ against 4:4 before remotely considering deploying 4:4 in
> production.  Then of course let me know since I had not the luck to get
> any number back and I've no access to any 64G box.

If you send me a *simple* simulation test, I'll gladly run it for you ;-)
But I'm not going to go fiddle with Oracle, and thousands of disks ;-)

> I don't care about 256G with 2:2 split, since intel and hp are now going
> x86-64 too.

Yeah, I don't think we ever need to deal with that kind of insanity ;-)
 
>> averse to objrmap for file-backed mappings either - I agree that the search
>> problems which were demonstrated are unlikely to bite in real life.
> 
> cool.
> 
> Martin's patch from IBM is a great start IMHO. I found a bug in the vma
> flags check though, VM_RESERVED should be checked too, not only
> VM_LOCKED, unless I'm missing something, but it's a minor issue.

I didn't actually write it - that was Dave McCracken ;-) I just suggested
the partial approach (because I'm dirty and lazy ;-)) and carried it
in my tree.

I agree with Andrew's comments though - it's not nice having the dual
approach of the partial, but the complexity of the full approach is a
bit scary and buys you little in real terms (performance and space).
I still believe that creating an "address_space-like structure" for
anon memory, shared across VMAs, is an idea that might give us cleaner
code - it also fixes other problems like Andi's NUMA API binding.

> We can write a testcase ourselves, it's pretty easy: just create a 2.7G
> file in /dev/shm, and mmap(MAP_SHARED) it from 1k processes and fault in
> all the pagetables from all tasks touching the shm vma. Then run a
> second copy until the machine starts swapping and see how things go. To
> do this you probably need 8G, this is why I didn't write the testcase
> myself yet ;).  Maybe I can simulate with less shm and fewer tasks on 1G
> boxes too, but the extreme lru effects of point 3 won't be visible
> there; the very same software configuration works fine on 1/2G boxes on
> stock 2.4. Problems show up when the lru grows, due to the algorithm not
> contemplating millions of dirty swapcache pages in a row at the end of
> the lru and some gigs of free cache at the head of the lru. The rmap-only
> issues can also be tested with math, no testcase is needed for that.

I don't have time to go write it at the moment, but I can certainly run
it on high-end hardware if that helps.
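
For reference, the testcase described above could look something like the
sketch below. This is not anybody's actual test, just a minimal userspace
illustration of "create a file in /dev/shm, mmap(MAP_SHARED) it from many
processes, and touch every page". The file name, the tiny default sizes,
the pipe handshake that keeps all mappings alive at once, and the function
names child and run are all assumptions made for the example. A child that
gets killed outright (e.g. by the OOM killer) never reports, and the parent
would then wait forever, so treat this as a starting point only.

```python
import mmap
import os
import sys

PAGE = mmap.PAGESIZE


def child(path, size, ready_w, release_r):
    """Map the file MAP_SHARED, touch every page, then hold the mapping."""
    ok = b"e"
    try:
        fd = os.open(path, os.O_RDWR)
        m = mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        for off in range(0, size, PAGE):
            m[off] = 1          # one write per page faults it into this task
        ok = b"k"
    finally:
        os.write(ready_w, ok)   # report success ("k") or failure ("e")
        os.read(release_r, 1)   # returns at EOF, once the parent closes its end
        os._exit(0 if ok == b"k" else 1)


def run(path, size, ntasks):
    """Create a `size`-byte file and fault it in from `ntasks` children.

    Returns the number of children that touched every page.
    """
    with open(path, "wb") as f:
        f.truncate(size)        # sparse file of the requested size
    ready_r, ready_w = os.pipe()
    release_r, release_w = os.pipe()
    pids = []
    for _ in range(ntasks):
        pid = os.fork()
        if pid == 0:
            os.close(ready_r)
            os.close(release_w)
            child(path, size, ready_w, release_r)   # never returns
        pids.append(pid)
    os.close(ready_w)
    os.close(release_r)
    reports = b""
    while len(reports) < ntasks:
        chunk = os.read(ready_r, ntasks)
        if not chunk:           # all children exited without reporting
            break
        reports += chunk
    # Every reporting child still holds its mapping at this point.
    os.close(release_w)         # let the children exit
    for pid in pids:
        os.waitpid(pid, 0)
    os.close(ready_r)
    return reports.count(b"k")


if __name__ == "__main__":
    # Defaults are deliberately tiny; the mail talks about 2.7G and 1k tasks.
    path = sys.argv[1] if len(sys.argv) > 1 else "/dev/shm/shmtest"
    size_mb = int(sys.argv[2]) if len(sys.argv) > 2 else 16
    ntasks = int(sys.argv[3]) if len(sys.argv) > 3 else 8
    good = run(path, size_mb << 20, ntasks)
    print(f"{good}/{ntasks} tasks faulted in {size_mb} MB")
    os.unlink(path)
```

To approximate the scenario in the quoted text, one would pass a much
larger size and task count, and start a second instance on a different
file to push the box into swap.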

M.
