From: Matt Helsley <matthltc@us.ibm.com>
To: Alexey Dobriyan <adobriyan@gmail.com>
Cc: xemul@parallels.com, containers@lists.linux-foundation.org,
linux-kernel@vger.kernel.org, dave@linux.vnet.ibm.com,
Al Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
mingo@elte.hu
Subject: Re: [PATCH 1/9] exec_path 1/9: introduce ->exec_path and switch /proc/*/exe
Date: Thu, 4 Jun 2009 20:49:40 -0700 [thread overview]
Message-ID: <20090605034940.GC9285@us.ibm.com> (raw)
In-Reply-To: <20090604224239.GA10666@x200.localdomain>
On Thu, Jun 04, 2009, Alexey Dobriyan wrote:
> On Thu, Jun 04, 2009 at 02:30:33PM -0700, Matt Helsley wrote:
> > On Thu, Jun 04, 2009 at 08:07:23AM -0700, Linus Torvalds wrote:
> > > On Thu, 4 Jun 2009, Matt Helsley wrote:
> > > >
> > > > Doesn't this pin the vfs mount of the executable for the lifetime
> > > > of
> > > > the task?
> > >
> > > Well, yes, but so does the current code.
> >
> > Not quite. The current code pins it as long as the corresponding VMAs
> > are mapped -- not for the lifetime of the task.
> >
> > > Sure, in _theory_ it can be a non-mmap executable (maybe people
> > > still have
> > > those old OMAGIC a.out executables), and in _theory_ you could unmap
> > > the
> > > executable even if it was originally mmap'ed, but neither of those
> > > is
> > > exactly common, are they?
> >
> > Not common to my knowledge, no.
> >
> > >
> > > So in practice, nothing has changed wrt lifetime of the executable.
> >
> > Almost all of the time, yes.
>
> And year ago executable wasn't pinned at all, so if you're opposing
> widening of time executable is pinned, you should revert your own patch
> which introduced it in first place.
You're wrong -- my patch did not widen the time the executable is pinned.
Read the exe_file changelog and give the code a more thorough read:
[The exe_file] reference would prevent the filesystem holding the executable
file from being unmounted even after unmapping the VMAs. So we track the
number of VM_EXECUTABLE VMAs and drop the new reference when the last one is
unmapped. This avoids pinning the mounted filesystem.
So long as the VMAs are mapped the filesystem is pinned. Until these VMAs are
unmapped exe_file does not extend the time that the filesystem is pinned. So
dropping the exe_file reference when the last VMA goes away should suffice in
preserving this property of the old VMA walk. The code does this by:
When exe_file is set from the bprm during exec we also set num_exe_file_vmas to 0.
When a VMA with VM_EXECUTABLE is mapped we call added_exe_file_vma() since
MAP_EXECUTABLE is only used for this (binfmt_* do_mmap() calls pass MAP_EXECUTABLE).
added_exe_file_vma() increments num_exe_file_vmas.
When a VM_EXECUTABLE VMA is split in two we call added_exe_file_vma() since there
is one more VMA.
When two VM_EXECUTABLE VMAs are merged we call removed_exe_file_vma() since there
is one less VMA. See removed_exe_file_vma() below.
When a VM_EXECUTABLE VMA is unmapped we call removed_exe_file_vma() since there
is one less VMA. When the last VM_EXECUTABLE VMA is unmapped num_exe_file_vmas
drops to 0 and we trigger fput(exe_file):
void removed_exe_file_vma(struct mm_struct *mm)
{
mm->num_exe_file_vmas--;
if ((mm->num_exe_file_vmas == 0) && mm->exe_file){
fput(mm->exe_file);
mm->exe_file = NULL;
}
}
which avoids pinning the VFS mount after the last VM_EXECUTABLE VMA is unmapped.
This is what used to happen with the old VMA-walk code too -- when the last
VM_EXECUTABLE VMA was unmapped the /proc/*/exe link disappeared (as you've noted)
and the vfs mount was no longer pinned by the /proc/*/exe code.
In case you're curious here's what Al said when my original patch lacked this code
and hence ignored mount pinning:
http://lkml.org/lkml/2007/7/12/398
So I added the VMA tracking code.
Since Linus seems to think vfs mount pinning is inconsequential
I don't mind dropping this objection. Please feel free to add my:
Acked-by: Matt Helsley <matthltc@us.ibm.com>
to this entire series.
>
> ->exec_path merely makes /proc/*/exe very unheuristical (binfmt loader
> decides, nothing else) and suitable for other code as demonstrated.
Yes, the VMA walk that exe_file replaced was a bit of a heuristic.
Cheers,
-Matt Helsley
next prev parent reply other threads:[~2009-06-05 3:49 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20090526113618.GJ28083@us.ibm.com>
[not found] ` <20090526162415.fb9cefef.akpm@linux-foundation.org>
[not found] ` <20090531215427.GA29534@x200.localdomain>
[not found] ` <20090531151953.8f8b14b5.akpm@linux-foundation.org>
[not found] ` <20090531151953.8f8b14b5.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-06-03 23:04 ` [PATCH 1/9] exec_path 1/9: introduce ->exec_path and switch /proc/*/exe Alexey Dobriyan
2009-06-03 23:05 ` [PATCH 2/9] exec_path 2/9: switch audit to ->exec_path Alexey Dobriyan
[not found] ` <20090603230422.GB853-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-06-03 23:05 ` [PATCH 3/9] exec_path 3/9: switch TOMOYO " Alexey Dobriyan
2009-06-03 23:06 ` [PATCH 4/9] exec_path 4/9: switch oprofile " Alexey Dobriyan
2009-06-03 23:06 ` [PATCH 6/9] exec_path 6/9: add struct spu::tsk Alexey Dobriyan
2009-06-03 23:06 ` [PATCH 5/9] exec_path 5/9: make struct spu_context::owner task_struct Alexey Dobriyan
2009-06-03 23:07 ` [PATCH 7/9] exec_path 7/9: switch cell SPU thing to ->exec_path Alexey Dobriyan
2009-06-03 23:07 ` [PATCH 8/9] exec_path 8/9: remove ->exe_file et al Alexey Dobriyan
2009-06-03 23:08 ` [PATCH 9/9] exec_path 9/9: remove VM_EXECUTABLE Alexey Dobriyan
2009-06-04 7:24 ` Matt Helsley
2009-06-03 23:36 ` [PATCH 1/9] exec_path 1/9: introduce ->exec_path and switch /proc/*/exe Linus Torvalds
2009-06-04 7:55 ` Matt Helsley
2009-06-04 8:10 ` Matt Helsley
2009-06-04 15:07 ` Linus Torvalds
2009-06-04 21:30 ` Matt Helsley
2009-06-04 22:42 ` Alexey Dobriyan
2009-06-05 3:49 ` Matt Helsley [this message]
2009-06-05 10:45 ` Christoph Hellwig
2009-06-05 15:10 ` Linus Torvalds
2009-06-05 15:41 ` Alexey Dobriyan
2009-06-05 15:49 ` Linus Torvalds
2009-06-05 16:09 ` Alexey Dobriyan
[not found] ` <20090605160943.GA5262-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-06-05 16:48 ` Linus Torvalds
2009-06-05 17:46 ` Alexey Dobriyan
2009-06-06 7:22 ` Al Viro
2009-06-15 22:10 ` Alexey Dobriyan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090605034940.GC9285@us.ibm.com \
--to=matthltc@us.ibm.com \
--cc=adobriyan@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=containers@lists.linux-foundation.org \
--cc=dave@linux.vnet.ibm.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).