From mboxrd@z Thu Jan 1 00:00:00 1970 From: Pavel Emelyanov Subject: Re: [Devel] [RFC][PATCH 4/4]: Enable cloning PTY namespaces Date: Wed, 06 Feb 2008 20:06:15 +0300 Message-ID: <47A9E907.5040102@openvz.org> References: <20080206050428.GA19461@us.ibm.com> <20080206051117.GD19764@us.ibm.com> <47A9780F.1070803@openvz.org> <20080206153750.GA15351@sergelap.austin.ibm.com> <47A9D5FA.2000205@openvz.org> <20080206162557.GA16518@sergelap.austin.ibm.com> <47A9E1C6.2020607@openvz.org> <20080206170402.GA17288@sergelap.austin.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20080206170402.GA17288-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org To: "Serge E. Hallyn" Cc: Containers List-Id: containers.vger.kernel.org Serge E. Hallyn wrote: > Quoting Pavel Emelyanov (xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org): >> Serge E. Hallyn wrote: >>> Quoting Pavel Emelyanov (xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org): >>>> Serge E. Hallyn wrote: >>>>> Quoting Pavel Emelyanov (xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org): >>>>>> sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org wrote: >>>>>>> From: Sukadev Bhattiprolu >>>>>>> Subject: [RFC][PATCH 4/4]: Enable cloning PTY namespaces >>>>>>> >>>>>>> Enable cloning PTY namespaces. >>>>>>> >>>>>>> TODO: >>>>>>> This version temporarily uses the clone flag '0x80000000' which >>>>>>> is unused in mainline atm, but used for CLONE_IO in -mm. >>>>>>> While we must extend clone() (urgently) to solve this, it hopefully >>>>>>> does not affect review of the rest of this patchset. >>>>>>> >>>>>>> Changelog: >>>>>>> - Version 0: Based on earlier versions from Serge Hallyn and >>>>>>> Matt Helsley. >>>>>>> >>>>>>> Signed-off-by: Sukadev Bhattiprolu >>>>>>> --- >>>>>>> fs/devpts/inode.c | 84 +++++++++++++++++++++++++++++++++++++++------- >>>>>>> include/linux/devpts_fs.h | 52 ++++++++++++++++++++++++++++ >>>>>>> include/linux/init_task.h | 1 >>>>>>> include/linux/nsproxy.h | 2 + >>>>>>> include/linux/sched.h | 2 + >>>>>>> kernel/fork.c | 2 - >>>>>>> kernel/nsproxy.c | 17 ++++++++- >>>>>>> 7 files changed, 146 insertions(+), 14 deletions(-) >>>>>>> >>>>>>> Index: linux-2.6.24/fs/devpts/inode.c >>>>>>> =================================================================== >>>>>>> --- linux-2.6.24.orig/fs/devpts/inode.c 2008-02-05 19:16:39.000000000 -0800 >>>>>>> +++ linux-2.6.24/fs/devpts/inode.c 2008-02-05 20:27:41.000000000 -0800 >>>>>>> @@ -25,18 +25,25 @@ >>>>>>> #define DEVPTS_SUPER_MAGIC 0x1cd1 >>>>>>> >>>>>>> extern int pty_limit; /* Config limit on Unix98 ptys */ >>>>>>> -static DEFINE_IDR(allocated_ptys); >>>>>>> static DECLARE_MUTEX(allocated_ptys_lock); >>>>>>> +static struct file_system_type devpts_fs_type; >>>>>>> + >>>>>>> +struct pts_namespace init_pts_ns = { >>>>>>> + .kref = { >>>>>>> + .refcount = ATOMIC_INIT(2), >>>>>>> + }, >>>>>>> + .allocated_ptys = IDR_INIT(init_pts_ns.allocated_ptys), >>>>>>> + .mnt = NULL, >>>>>>> +}; >>>>>>> >>>>>>> static inline struct idr *current_pts_ns_allocated_ptys(void) >>>>>>> { >>>>>>> - return &allocated_ptys; >>>>>>> + return ¤t->nsproxy->pts_ns->allocated_ptys; >>>>>>> } >>>>>>> >>>>>>> -static struct vfsmount *devpts_mnt; >>>>>>> static inline struct vfsmount *current_pts_ns_mnt(void) >>>>>>> { >>>>>>> - return devpts_mnt; >>>>>>> + return current->nsproxy->pts_ns->mnt; >>>>>>> } >>>>>>> >>>>>>> static struct { >>>>>>> @@ -59,6 +66,42 @@ static match_table_t tokens = { >>>>>>> {Opt_err, NULL} >>>>>>> }; >>>>>>> >>>>>>> +struct pts_namespace *new_pts_ns(void) >>>>>>> +{ >>>>>>> + struct pts_namespace *ns; >>>>>>> + >>>>>>> + ns = kmalloc(sizeof(*ns), GFP_KERNEL); >>>>>>> + if (!ns) >>>>>>> + return ERR_PTR(-ENOMEM); >>>>>>> + >>>>>>> + ns->mnt = kern_mount_data(&devpts_fs_type, ns); >>>>>> You create a circular references here - the namespace >>>>>> holds the vfsmnt, the vfsmnt holds a superblock, a superblock >>>>>> holds the namespace. >>>>> Hmm, yeah, good point. That was probably in my original version last >>>>> year, so my fault not Suka's. Suka, would it work to have the >>>>> sb->s_info point to the namespace but not grab a reference, than have >>>> If you don't then you may be in situation, when this devpts >>>> is mounted from userspace and in case the namespace is dead >>>> superblock will point to garbage... Superblock MUST hold the >>>> namespace :) >>> But when the ns is freed sb->s_info would be NULL. Surely the helpers >>> can be made to handle that safely? >> Hm... How do we find the proper superblock? Have a reference on >> it from the namespace? I'm afraid it will be easy to resolve the >> locking issues here. >> >> I propose another scheme - we simply don't have ANY references >> from namespace to superblock/vfsmount, but get the current >> namespace in devpts_get_sb() and put in devpts_free_sb(). > > But then it really does become impossible to use a /dev/pts from another > namespace, right? Right. I already see this from another thread :) Let's drop this one. >>>>> free_pts_ns() null out its sb->s_info, i.e. something like >>>>> >>>>> void free_pts_ns(struct kref *ns_kref) >>>>> { >>>>> struct pts_namespace *ns; >>>>> struct super_block *sb; >>>>> >>>>> ns = container_of(ns_kref, struct pts_namespace, kref); >>>>> BUG_ON(ns == &init_pts_ns); >>>>> sb = ns->mnt->mnt_sb; >>>>> >>>>> mntput(ns->mnt); >>>>> sb->s_info = NULL; >>>>> >>>>> /* >>>>> * TODO: >>>>> * idr_remove_all(&ns->allocated_ptys); introduced in >>>>> .6.23 >>>>> */ >>>>> idr_destroy(&ns->allocated_ptys); >>>>> kfree(ns); >>>>> } >>>>> >>>>> >