From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Matt Helsley <matthltc@us.ibm.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Oleg Nesterov <oleg@redhat.com>, Eric Paris <eparis@redhat.com>,
"linux-security-module@vger.kernel.org"
<linux-security-module@vger.kernel.org>,
"oprofile-list@lists.sf.net" <oprofile-list@lists.sf.net>,
Linus Torvalds <torvalds@linux-foundation.org>,
Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
Date: Tue, 03 Apr 2012 09:06:12 +0400
Message-ID: <4F7A8544.2020603@openvz.org>
In-Reply-To: <20120402231837.GC32299@count0.beaverton.ibm.com>
Matt Helsley wrote:
> On Sat, Mar 31, 2012 at 01:29:29PM +0400, Konstantin Khlebnikov wrote:
>> Currently the kernel sets mm->exe_file during sys_execve() and then tracks the
>> number of vmas with the VM_EXECUTABLE flag in mm->num_exe_file_vmas. As soon as
>> this counter drops to zero, the kernel resets mm->exe_file to NULL. It also
>> resets mm->exe_file at the last mmput(), when mm->mm_users drops to zero.
>>
>> A vma with the VM_EXECUTABLE flag appears after mapping a file with the
>> MAP_EXECUTABLE flag. Such vmas can appear only at sys_execve() or after vma
>> splitting, because sys_mmap ignores this flag. Usually the binfmt module sets
>> mm->exe_file and mmaps some executable vmas with this file; they hold
>> mm->exe_file while the task is running.
>>
>> comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
>> where all this stuff was introduced:
>>
>>> The kernel implements readlink of /proc/pid/exe by getting the file from
>>> the first executable VMA. Then the path to the file is reconstructed and
>>> reported as the result.
>>>
>>> Because of the VMA walk the code is slightly different on nommu systems.
>>> This patch avoids separate /proc/pid/exe code on nommu systems. Instead of
>>> walking the VMAs to find the first executable file-backed VMA we store a
>>> reference to the exec'd file in the mm_struct.
>>>
>>> That reference would prevent the filesystem holding the executable file
>>> from being unmounted even after unmapping the VMAs. So we track the number
>>> of VM_EXECUTABLE VMAs and drop the new reference when the last one is
>>> unmapped. This avoids pinning the mounted filesystem.
>>
>> So this logic is hooked into every file mmap/munmap and vma split/merge just to
>> fix a hypothetical case: an mm that has already unmapped all its executable
>> files but is still alive, pinning the fs and preventing umount. Does anyone know
>> a real-world example? An mm can be borrowed by swapoff or some get_task_mm()
>> user, but that is not a big problem.
>>
>> Thus, we can remove all this stuff together with the VM_EXECUTABLE flag and
>> keep mm->exe_file alive until the final mmput().
>>
>> After that we can access current->mm->exe_file without any locks
>> (after checking current->mm and mm->exe_file for NULL).
>>
>> Some code around security and oprofile still uses VM_EXECUTABLE for retrieving
>> the task's executable file; after this patch it will use mm->exe_file directly.
>> In tomoyo and audit the mm is always current->mm; oprofile uses get_task_mm().
>
> Perhaps I'm missing something but it seems like you ought to split
> this into two patches. The first could fix up the cell, tile, etc. arch
> code to use the exe_file reference rather than walk the VMAs. Then the
> second patch could remove the unusual logic used to allow userspace to unpin
> the mount and we could continue to discuss that separately. It would
> also make the git log somewhat cleaner I think...

OK, I'll resend this patch as an independent patch set;
I need to bring back the mm->mmap_sem locking anyway.
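
For readers following the thread, the two lifetimes of mm->exe_file discussed in the quoted description can be compared with a toy userspace model. This is only a sketch, not kernel code: the toy_* names, the count_exe_vmas switch, and the plain integer reference count are assumptions made for illustration, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's struct file and struct mm_struct. */
struct toy_file {
    int refcount;
};

struct toy_mm {
    struct toy_file *exe_file;
    int num_exe_file_vmas;  /* used only by the counting scheme */
    int mm_users;
    bool count_exe_vmas;    /* true: current scheme, false: proposed one */
};

/* Drop the mm's reference to its executable file, if it still holds one. */
void toy_drop_exe_file(struct toy_mm *mm)
{
    if (mm->exe_file) {
        mm->exe_file->refcount--;
        mm->exe_file = NULL;
    }
}

/* Model of exec: the mm takes one reference to the executable file. */
void toy_exec(struct toy_mm *mm, struct toy_file *exe, bool count_exe_vmas)
{
    exe->refcount++;
    mm->exe_file = exe;
    mm->num_exe_file_vmas = 0;
    mm->mm_users = 1;
    mm->count_exe_vmas = count_exe_vmas;
}

/* A VM_EXECUTABLE-style vma appears (binfmt mapping or vma split). */
void toy_map_exe_vma(struct toy_mm *mm)
{
    if (mm->count_exe_vmas)
        mm->num_exe_file_vmas++;
}

/* Such a vma goes away; only the counting scheme reacts to this. */
void toy_unmap_exe_vma(struct toy_mm *mm)
{
    if (!mm->count_exe_vmas)
        return;
    assert(mm->num_exe_file_vmas > 0);
    if (--mm->num_exe_file_vmas == 0)
        toy_drop_exe_file(mm);
}

/* Last user of the mm drops exe_file in both schemes. */
void toy_mmput(struct toy_mm *mm)
{
    assert(mm->mm_users > 0);
    if (--mm->mm_users == 0)
        toy_drop_exe_file(mm);
}
```

With count_exe_vmas set, the file reference is released as soon as the last counted vma is unmapped; with it cleared, the reference survives until toy_mmput() drops the last user, which is the simplification proposed in the patch description.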