From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx104.postini.com [74.125.245.104])
	by kanga.kvack.org (Postfix) with SMTP id 0A23C6B004A
	for ; Thu, 5 Apr 2012 17:04:48 -0400 (EDT)
Received: by bkwq16 with SMTP id q16so2129124bkw.14
	for ; Thu, 05 Apr 2012 14:04:46 -0700 (PDT)
Message-ID: <4F7E08EB.5070600@openvz.org>
Date: Fri, 06 Apr 2012 01:04:43 +0400
From: Konstantin Khlebnikov
MIME-Version: 1.0
Subject: Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
References: <20120331091049.19373.28994.stgit@zurg>
	<20120331092929.19920.54540.stgit@zurg>
	<20120331201324.GA17565@redhat.com>
	<20120402230423.GB32299@count0.beaverton.ibm.com>
	<4F7A863C.5020407@openvz.org>
	<20120403181631.GD32299@count0.beaverton.ibm.com>
	<20120403193204.GE3370@moon>
	<20120405202904.GB7761@count0.beaverton.ibm.com>
In-Reply-To: <20120405202904.GB7761@count0.beaverton.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Matt Helsley
Cc: Cyrill Gorcunov, Oleg Nesterov, "linux-mm@kvack.org",
	Andrew Morton, "linux-kernel@vger.kernel.org", Eric Paris,
	"linux-security-module@vger.kernel.org",
	"oprofile-list@lists.sf.net", Linus Torvalds, Al Viro

Matt Helsley wrote:
> On Tue, Apr 03, 2012 at 11:32:04PM +0400, Cyrill Gorcunov wrote:
>> On Tue, Apr 03, 2012 at 11:16:31AM -0700, Matt Helsley wrote:
>>> On Tue, Apr 03, 2012 at 09:10:20AM +0400, Konstantin Khlebnikov wrote:
>>>> Matt Helsley wrote:
>>>>> On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
>>>>>> On 03/31, Konstantin Khlebnikov wrote:
>>>>>>>
>>>>>>> comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
>>>>>>> where all this stuff was introduced:
>>>>>>>
>>>>>>>> ...
>>>>>>>> This avoids pinning the mounted filesystem.
>>>>>>>
>>>>>>> So, this logic is hooked into every file mmap/munmap and vma split/merge just to
>>>>>>> fix some hypothetical pinning fs from umounting by mm which already unmapped all
>>>>>>> its executable files, but still alive. Does anyone know any real world example?
>>>>>>
>>>>>> This is the question to Matt.
>>>>>
>>>>> This is where I got the scenario:
>>>>>
>>>>> https://lkml.org/lkml/2007/7/12/398
>>>>
>>>> Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
>>>> gives userspace ability to unpin vfsmount explicitly.
>>>
>>> Doesn't that break the semantics of the kernel ABI?
>>
>> Which one? exe_file can be changed iff there is no MAP_EXECUTABLE left.
>> Still, once assigned (via this prctl) the mm_struct::exe_file can't be changed
>> again, until program exit.
>
> The prctl() interface itself is fine as it stands now.
>
> As far as I can tell Konstantin is proposing that we remove the unusual
> counter that tracks the number of mappings of the exe_file and require
> userspace use the prctl() to drop the last reference. That's what I think
> will break the ABI because after that change you *must* change userspace
> code to use the prctl(). It's an ABI change because the same sequence of
> system calls with the same input bits produces different behavior.

But common software does not require this at all. I did not find any real
examples, only a hypothesis by Al Viro: https://lkml.org/lkml/2007/7/12/398

libhugetlbfs isn't a good example either. man proc says that /proc/[pid]/exe
stays alive as long as the main thread is alive, but with libhugetlbfs
/proc/[pid]/exe disappears too early.

Also, I would not call it an ABI: this corner case isn't documented, and I'm
afraid only a few people in the world know about it =)
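
For anyone who wants to observe the behaviour under discussion from userspace,
here is a minimal sketch (not part of the patch series). It only reads the
/proc/[pid]/exe symlink; the helper name exe_path() is made up for this
illustration. It assumes a Linux procfs and returns None whenever the link
cannot be read, which, as far as I can tell from this thread, is also what a
task would see once its exe_file reference has been dropped.

```python
import os


def exe_path(pid="self"):
    """Return the target of /proc/<pid>/exe, or None if it cannot be read.

    exe_path() is a hypothetical helper written for this example only.
    """
    try:
        # /proc/<pid>/exe is a symlink, so readlink() returns its target.
        return os.readlink(f"/proc/{pid}/exe")
    except OSError:
        # No such process, no procfs, no permission, or no exe link.
        return None


if __name__ == "__main__":
    print(exe_path())
```

Calling exe_path() before and after a process unmaps its executable mappings
would show whether the link is still there; the sketch does not attempt the
unmapping itself.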