From: Peter Zijlstra <peterz@infradead.org>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
David Miller <davem@davemloft.net>, Ollie Wild <aaw@google.com>,
Rik van Riel <riel@redhat.com>,
viro@ftp.linux.org.uk, linux-arch@vger.kernel.org,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCHSET] mremap/mmap mess
Date: Wed, 09 Dec 2009 14:37:42 +0100
Message-ID: <1260365863.5489.540.camel@laptop>
In-Reply-To: <Pine.LNX.4.64.0912091242010.26035@sister.anvils>
On Wed, 2009-12-09 at 13:12 +0000, Hugh Dickins wrote:
> On Wed, 9 Dec 2009, Peter Zijlstra wrote:
> > /me ponders.. doesn't the binfmt engine cruft need the args in place in
> > order to execute?
>
> Hardly looked, Al will be more up to date with all the grisly details.
>
> The "binfmt engine cruft" being search_binary_handler()? I think the
> args have to be "ready to go" before that, but that's different from
> the new mm actually being used as an mm before that. It used not to
> be used early, but from 2.6.23 on it is used early, via get_user_pages.
Yeah, specifically the fn() call in there, which will mostly land you in
load_elf_binary(). After that I lose track.
> > That is, IIRC the problem is that you need to have the argv/env in place
> > for the binfmt engine thing, and need to have run the binfmt engine
> > thing before you know the personality.
>
> It is a problem that personality is discovered late in the sequence,
> and that is a considerable part of what Al is up against.
>
> >
> > As to your idea, if that were feasible we could do without the copy and
> > simply steal the pages directly from the old mm.
>
> Perhaps, but I think that would lead to a gradual accumulation
> of more and more pages pinned in memory by scattered references.
Well, IF the binfmt stuff can deal with the arrays being in the old mm
then it doesn't need to pin them I think, but I really don't know how
all this binfmt stuff works.
Reading fs/binfmt_elf.c:load_elf_binary(), it looks like there might be a
spot where the personality is known and we still have the old mm around;
maybe we can hook in there. We'd need to visit all the binfmt handlers, though.
If we can make the binfmt stuff pass the correct location to
flush_old_exec() we could do the copy there.
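To make the ordering concrete, here is a rough userspace model of it. This is only a sketch: every name in it (exec_model, model_loader(), model_flush_old() and so on) is made up for illustration and none of it is kernel code or kernel API. It assumes the loader learns the personality first and then calls the flush step, which copies the args while the old mm is still alive and only then drops it.

```c
/* Toy userspace model of the exec ordering discussed above.
 * All names are hypothetical; nothing here is kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum step {
    STEP_PICK_HANDLER,      /* stand-in for search_binary_handler() choosing fn() */
    STEP_PERSONALITY_KNOWN, /* loader has parsed enough to know the personality */
    STEP_COPY_ARGS,         /* argv/env copied to their final location */
    STEP_DROP_OLD_MM        /* old address space released */
};

struct exec_model {
    bool personality_known;
    bool old_mm_alive;
    bool args_copied;
    enum step trace[8];
    size_t ntrace;
};

static void record(struct exec_model *m, enum step s)
{
    assert(m->ntrace < sizeof(m->trace) / sizeof(m->trace[0]));
    m->trace[m->ntrace++] = s;
}

/* Stand-in for the flush step: the copy is only legal while the old mm
 * still exists and the personality (hence the target layout) is known. */
static void model_flush_old(struct exec_model *m)
{
    assert(m->old_mm_alive && m->personality_known);
    m->args_copied = true;
    record(m, STEP_COPY_ARGS);

    m->old_mm_alive = false;
    record(m, STEP_DROP_OLD_MM);
}

/* Stand-in for a binfmt loader such as the ELF one. */
static void model_loader(struct exec_model *m)
{
    m->personality_known = true;
    record(m, STEP_PERSONALITY_KNOWN);
    model_flush_old(m);
}

/* Runs the whole modelled sequence and returns the recorded state. */
static struct exec_model model_exec(void)
{
    struct exec_model m;

    memset(&m, 0, sizeof(m));
    m.old_mm_alive = true;
    record(&m, STEP_PICK_HANDLER);
    model_loader(&m);
    return m;
}
```

Calling model_exec() and walking the returned trace shows the copy landing between "personality known" and "old mm dropped", which is the window I'm hoping exists. Whether every real binfmt handler has such a window is exactly the part that would need checking.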
> I Cc'ed you really because I wasn't much involved in the variable
> length arg discussions, and don't remember how important swappability
> was viewed at the time. It is a significant feature of what you and
> Ollie ended up with, so I'm guessing it was then viewed as essential.
> That would be my view.
>
> But now it's suggested that the TLB+cache effects of using an mm there
> are counter-productive, and better to forget swappability: well, I want
> to keep it, and Al is making a brave effort to hold on to it, but I'm
> wary of the weirdness involved.
Right, the swappability is key: without it you can easily run the
kernel into the ground if you don't have a limit on the argv/env arrays.
And not having a limit was the whole point.