From mboxrd@z Thu Jan 1 00:00:00 1970
From: Al Viro
Subject: Re: [RFC] Add option to mount only a pids subset
Date: Mon, 13 Mar 2017 13:27:33 +0000
Message-ID: <20170313132732.GR29622@ZenIV.linux.org.uk>
References: <20170221145746.GA31914@redhat.com>
 <20170306230515.GA3453@comp-core-i7-2640m-0182e6>
 <20170312015430.GO29622@ZenIV.linux.org.uk>
 <20170312021257.GP29622@ZenIV.linux.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Andy Lutomirski
Cc: Alexey Gladkov, Linux Kernel Mailing List, Linux API,
 "Kirill A. Shutemov", Vasiliy Kulikov, "Eric W. Biederman",
 Oleg Nesterov, Pavel Emelyanov, James Bottomley
List-Id: linux-api@vger.kernel.org

On Sun, Mar 12, 2017 at 08:19:33PM -0700, Andy Lutomirski wrote:
> On Sat, Mar 11, 2017 at 6:13 PM, Al Viro wrote:
> > PS: AFAICS, simple mount --bind of your pid-only mount will suddenly
> > expose the full thing. And as for the lifetimes making no sense...
> > note that you are simply not freeing these structures of yours.
> > Try to handle that and you'll get a serious PITA all over the
> > place.
> >
> > What are you trying to achieve, anyway? Why not add a second vfsmount
> > pointer per pid_namespace and make it initialized on demand, at the
> > first attempt of no-pid mount? Just have a separate no-pid instance
> > created for those namespaces where it had been asked for, with
> > separate superblock and dentry tree not containing anything other
> > that pid-only parts + self + thread-self...
>
> Can't we just make procfs work like most other filesystems and have
> each mount have its own superblock? If we need to do something funky
> to stat() output to keep existing userspace working, I think that's
> okay.

First of all, most of the filesystems do *NOT* guarantee anything of
that sort. And what's the point of having more instances than
necessary, anyway?

> As far as I can tell, proc_mnt is very nearly useless -- it seems to
> be used for proc_flush_task (which claims to be purely an optimization
> and could be preserved in the common case where there's only one
> relevant mount) and for sysctl_binary. For the latter, we could
> create proc_mnt but make actual user-initiated mounts be new
> superblocks anyway.

Again, what for? It won't salvage that kludge...

It's not as if it had been hard to have separate pid-only instance
created when asked for (and reused every time when we are asked for
pid-only). What's the point of ever having more than two instances
per pidns? IDGI...

Folks, there is no one-to-one correspondence between mountpoints and
superblocks. Not since 2000 or so. Just don't try to shove your
per-superblock stuff into vfsmount; it simply won't work. If you want
a separate instance for that thing, then just go ahead and have
->mount() decide which one to use (and whether to create a new one).
All there is to it...
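
For illustration only, here is a rough userspace model of the last
point: at most two instances per pid namespace, created on demand by
whatever plays the role of ->mount(), with any number of mounts
(including bind mounts) sharing them. This is not kernel code and not
taken from any patch in this thread; every name in it (model_sb,
model_pidns, model_mnt, model_get_sb, model_mount, model_bind) is
hypothetical, and lifetime/teardown is deliberately left out.

```c
/* Userspace model only; all names are hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a superblock: one filesystem instance. */
struct model_sb {
    bool pid_only;  /* true: only pid dirs + self + thread-self */
    int active;     /* number of mounts currently referring to it */
};

/* Stand-in for a pid namespace: at most two instances, made on demand. */
struct model_pidns {
    struct model_sb *full;      /* the regular instance */
    struct model_sb *pid_only;  /* created at the first pid-only mount */
};

/* Stand-in for a vfsmount: nothing but a reference to some instance. */
struct model_mnt {
    struct model_sb *sb;
};

/*
 * Plays the role of ->mount(): decide which instance to use and
 * whether a new one has to be created. Returns NULL on allocation
 * failure.
 */
struct model_sb *model_get_sb(struct model_pidns *ns, bool pid_only)
{
    struct model_sb **slot = pid_only ? &ns->pid_only : &ns->full;

    if (*slot == NULL) {
        *slot = calloc(1, sizeof(**slot));
        if (*slot == NULL)
            return NULL;
        (*slot)->pid_only = pid_only;
    }
    (*slot)->active++;
    return *slot;
}

/* A user-initiated mount: attach a new mount to the chosen instance. */
bool model_mount(struct model_pidns *ns, bool pid_only, struct model_mnt *mnt)
{
    mnt->sb = model_get_sb(ns, pid_only);
    return mnt->sb != NULL;
}

/* A bind mount: a new mount that shares the old mount's instance. */
void model_bind(const struct model_mnt *old, struct model_mnt *new_mnt)
{
    new_mnt->sb = old->sb;
    new_mnt->sb->active++;
}
```

In this model the pid-only property lives in the instance, not in the
mount, so a bind of a pid-only mount can only ever see the pid-only
instance; two pid-only mounts in the same namespace end up on the
same instance, and a namespace never has more than two.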