* [PATCHv3 0/2] ns, procfs: pid conversion between ns and showing pidns hierarchy
@ 2014-09-24 10:00 Chen Hanxiao
2014-09-24 10:00 ` [PATCHv3 1/2] procfs: show hierarchy of pid namespace Chen Hanxiao
2014-09-24 10:00 ` [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns Chen Hanxiao
0 siblings, 2 replies; 11+ messages in thread
From: Chen Hanxiao @ 2014-09-24 10:00 UTC (permalink / raw)
To: containers, linux-kernel
Cc: Serge Hallyn, Eric W. Biederman, Oleg Nesterov, David Howells,
Richard Weinberger, Pavel Emelyanov, Vasiliy Kulikov,
Mateusz Guzik
This series exposes the pids that a process has inside containers
via procfs.
It also shows the hierarchy of pid namespaces.
With both, we can tell what a pid looks like inside a container
and how the pid namespaces relate to each other.
v3: fix a race and a memory leak
in pidns_hierarchy.
v2: use a procfs text file instead of directories
to show the hierarchy of pid namespaces
Chen Hanxiao (2):
procfs: show hierarchy of pid namespace
/proc/PID/status: show all sets of pid according to ns
fs/proc/Kconfig | 6 ++
fs/proc/Makefile | 1 +
fs/proc/array.c | 17 ++++
fs/proc/pidns_hierarchy.c | 227 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 251 insertions(+)
create mode 100644 fs/proc/pidns_hierarchy.c
--
1.9.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCHv3 1/2] procfs: show hierarchy of pid namespace
2014-09-24 10:00 [PATCHv3 0/2] ns, procfs: pid conversion between ns and showing pidns hierarchy Chen Hanxiao
@ 2014-09-24 10:00 ` Chen Hanxiao
2014-09-24 17:45 ` Mateusz Guzik
2014-09-24 10:00 ` [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns Chen Hanxiao
1 sibling, 1 reply; 11+ messages in thread
From: Chen Hanxiao @ 2014-09-24 10:00 UTC (permalink / raw)
To: containers, linux-kernel
Cc: Serge Hallyn, Eric W. Biederman, Oleg Nesterov, David Howells,
Richard Weinberger, Pavel Emelyanov, Vasiliy Kulikov,
Mateusz Guzik
This patch shows the hierarchy of pid namespaces
in /proc/pidns_hierarchy, like:
[root@localhost ~]#cat /proc/pidns_hierarchy
/proc/18060/ns/pid /proc/18102/ns/pid /proc/1534/ns/pid
/proc/18060/ns/pid /proc/18102/ns/pid /proc/1600/ns/pid
/proc/1550/ns/pid
It shows the pid hierarchy below:
init_pid_ns (not shown in /proc/pidns_hierarchy)
                   │
        ┌─────────────────────┐
       ns1                   ns2
        │                     │
/proc/1550/ns/pid    /proc/18060/ns/pid
                              │
                             ns3
                              │
                     /proc/18102/ns/pid
                              │
                    ┌───────────────────┐
                   ns4                 ns5
                    │                   │
            /proc/1534/ns/pid   /proc/1600/ns/pid
Every pid printed in pidns_hierarchy
is the init pid of that pid ns level.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
---
v3: fix a race issue and memory leak issue
v2: use a procfs text file instead of dirs under /proc
fs/proc/Kconfig | 6 ++
fs/proc/Makefile | 1 +
fs/proc/pidns_hierarchy.c | 227 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 234 insertions(+)
create mode 100644 fs/proc/pidns_hierarchy.c
diff --git a/fs/proc/Kconfig b/fs/proc/Kconfig
index 2183fcf..e2e2292 100644
--- a/fs/proc/Kconfig
+++ b/fs/proc/Kconfig
@@ -71,3 +71,9 @@ config PROC_PAGE_MONITOR
 	  /proc/pid/smaps, /proc/pid/clear_refs, /proc/pid/pagemap,
 	  /proc/kpagecount, and /proc/kpageflags. Disabling these
 	  interfaces will reduce the size of the kernel by approximately 4kb.
+ +config PROC_PID_HIERARCHY + bool "Enable /proc/pidns_hierarchy support" if EXPERT + depends on PROC_FS + help + Show pid namespace hierarchy information diff --git a/fs/proc/Makefile b/fs/proc/Makefile index 7151ea4..33e384b 100644 --- a/fs/proc/Makefile +++ b/fs/proc/Makefile @@ -30,3 +30,4 @@ proc-$(CONFIG_PROC_KCORE) += kcore.o proc-$(CONFIG_PROC_VMCORE) += vmcore.o proc-$(CONFIG_PRINTK) += kmsg.o proc-$(CONFIG_PROC_PAGE_MONITOR) += page.o +proc-$(CONFIG_PROC_PID_HIERARCHY) += pidns_hierarchy.o diff --git a/fs/proc/pidns_hierarchy.c b/fs/proc/pidns_hierarchy.c new file mode 100644 index 0000000..8a73095 --- /dev/null +++ b/fs/proc/pidns_hierarchy.c @@ -0,0 +1,227 @@ +#include <linux/init.h> +#include <linux/errno.h> +#include <linux/proc_fs.h> +#include <linux/module.h> +#include <linux/list.h> +#include <linux/slab.h> +#include <linux/pid_namespace.h> +#include <linux/seq_file.h> +#include <linux/mutex.h> + +/* + * /proc/pidns_hierarchy + * + * show the hierarchy of pid namespace + */ + +#define NS_HIERARCHY "pidns_hierarchy" + +static LIST_HEAD(pidns_list); +static LIST_HEAD(pidns_tree); +static DEFINE_MUTEX(pidns_list_lock); + +/* list for host pid collection */ +struct pidns_list { + struct list_head list; + struct pid *pid; +}; + +static void free_pidns_list(struct list_head *head) +{ + struct pidns_list *tmp, *pos; + + list_for_each_entry_safe(pos, tmp, head, list) { + list_del(&pos->list); + kfree(pos); + } +} + +/* + * Only add init pid of different namespaces + */ +static int +pidns_list_really_add(struct pid *pid, struct list_head *list_head) +{ + struct pidns_list *pos; + + if (!is_child_reaper(pid)) + return 0; + + list_for_each_entry(pos, list_head, list) + if (ns_of_pid(pid) == ns_of_pid(pos->pid)) + return 0; + + return 1; +} + +static int +pidns_list_add(struct pid *pid, struct list_head *list_head) +{ + struct pidns_list *ent; + + if (pidns_list_really_add(pid, list_head)) { + ent = kmalloc(sizeof(*ent), GFP_ATOMIC); + if (!ent) + return 
-ENOMEM; + + ent->pid = pid; + list_add_tail(&ent->list, list_head); + } + + return 0; +} + +static int +pidns_list_filter(void) +{ + struct pidns_list *pos, *pos_t; + struct pid_namespace *ns0, *ns1; + struct pid *pid0, *pid1; + int flag = 0; + int rc; + + /* screen pid with relationship + * in pidns_list, we may add pids like + * ns0 ns1 ns2 + * pid1->pid2->pid3 + * we should screen pid1, pid2 and keep pid3 + */ + list_for_each_entry(pos, &pidns_list, list) { + list_for_each_entry(pos_t, &pidns_list, list) { + flag = 0; + pid0 = pos->pid; + pid1 = pos_t->pid; + ns0 = pid0->numbers[pid0->level].ns; + ns1 = pid1->numbers[pid1->level].ns; + if (pos->pid->level < pos_t->pid->level) + for (; ns1 != NULL; ns1 = ns1->parent) + if (ns0 == ns1) { + flag = 1; + break; + } + if (flag == 1) + break; + } + + if (flag == 0) { + rc = pidns_list_add(pos->pid, &pidns_tree); + if (rc) + goto out; + } + } + + /* Now all usefull stuff are in pidns_tree, free pidns_list*/ + free_pidns_list(&pidns_list); + + return 0; + +out: + free_pidns_list(&pidns_tree); + return rc; +} + +/* collect pids in pidns_list, + * then remove duplicated ones, + * add the rest to pidns_tree + */ +static int proc_pidns_list_refresh(void) +{ + struct pid *pid; + struct task_struct *p; + int rc; + + /* collect pid in differet ns */ + rcu_read_lock(); + for_each_process(p) { + pid = task_pid(p); + if (pid && (pid->level > 0)) { + rc = pidns_list_add(pid, &pidns_list); + if (rc) + goto out; + } + } + + /* screen duplicate pids from list pidns_list + * and form a new list pidns_tree + */ + rc = pidns_list_filter(); + if (rc) + goto out; + rcu_read_unlock(); + + return 0; + +out: + free_pidns_list(&pidns_list); + rcu_read_unlock(); + return rc; +} + +static int nslist_proc_show(struct seq_file *m, void *v) +{ + struct pidns_list *pos; + struct pid_namespace *ns, *curr_ns; + struct pid *pid; + char pid_buf[32]; + int i, curr_level; + int rc; + + curr_ns = task_active_pid_ns(current); + + 
mutex_lock(&pidns_list_lock); + rc = proc_pidns_list_refresh(); + if (rc) { + mutex_unlock(&pidns_list_lock); + return rc; + } + + /* print pid namespace hierarchy */ + list_for_each_entry(pos, &pidns_tree, list) { + pid = pos->pid; + curr_level = -1; + ns = pid->numbers[pid->level].ns; + /* Check whether a pid has relationship with current ns */ + for (; ns != NULL; ns = ns->parent) + if (ns == curr_ns) + curr_level = curr_ns->level; + + if (curr_level == -1) + continue; + + for (i = curr_level + 1; i <= pid->level; i++) { + ns = pid->numbers[i].ns; + /* show PID '1' in specific pid ns */ + snprintf(pid_buf, 32, "/proc/%u/ns/pid", + pid_vnr(find_pid_ns(1, ns))); + seq_printf(m, "%s ", pid_buf); + } + + seq_putc(m, '\n'); + } + + free_pidns_list(&pidns_tree); + mutex_unlock(&pidns_list_lock); + + return 0; +} + +static int nslist_proc_open(struct inode *inode, struct file *file) +{ + return single_open(file, nslist_proc_show, NULL); +} + +static const struct file_operations proc_nspid_nslist_fops = { + .open = nslist_proc_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static int __init pidns_hierarchy_init(void) +{ + proc_create(NS_HIERARCHY, S_IWUGO, + NULL, &proc_nspid_nslist_fops); + + return 0; +} +fs_initcall(pidns_hierarchy_init); -- 1.9.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
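Editor's note: the filtering step in pidns_list_filter() keeps only those entries whose namespace is not an ancestor of another collected entry, so that each output line is one root-to-leaf path. Below is a minimal user-space sketch of that idea. It is not the kernel code: struct ns_model and both function names are hypothetical stand-ins for struct pid_namespace and the list handling in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct pid_namespace. */
struct ns_model {
	const struct ns_model *parent;	/* NULL for the initial namespace */
	int level;			/* 0 for the initial namespace */
};

/* Return 1 if anc is a proper ancestor of ns, 0 otherwise. */
static int is_proper_ancestor(const struct ns_model *anc,
			      const struct ns_model *ns)
{
	assert(anc != NULL && ns != NULL);

	/* An ancestor always sits at a strictly lower level. */
	if (anc->level >= ns->level)
		return 0;

	for (ns = ns->parent; ns != NULL; ns = ns->parent)
		if (ns == anc)
			return 1;

	return 0;
}

/*
 * Copy into out[] every entry of in[] that is not a proper ancestor
 * of some other entry, preserving order. Returns the number kept.
 * Like the patch, this compares every pair, so it is O(n^2) in the
 * number of collected namespaces.
 */
static size_t keep_leaf_namespaces(const struct ns_model **in, size_t n,
				   const struct ns_model **out)
{
	size_t i, j, kept = 0;

	for (i = 0; i < n; i++) {
		int covered = 0;

		for (j = 0; j < n && !covered; j++)
			covered = is_proper_ancestor(in[i], in[j]);

		if (!covered)
			out[kept++] = in[i];
	}

	return kept;
}
```

With the hierarchy from the commit message (ns1 and ns2 under init_pid_ns, ns3 under ns2, ns4 and ns5 under ns3), only ns1, ns4 and ns5 survive, which matches the three lines of the sample output.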
* Re: [PATCHv3 1/2] procfs: show hierarchy of pid namespace 2014-09-24 10:00 ` [PATCHv3 1/2] procfs: show hierarchy of pid namespace Chen Hanxiao @ 2014-09-24 17:45 ` Mateusz Guzik 2014-09-25 9:45 ` Chen, Hanxiao 0 siblings, 1 reply; 11+ messages in thread From: Mateusz Guzik @ 2014-09-24 17:45 UTC (permalink / raw) To: Chen Hanxiao Cc: containers, linux-kernel, Serge Hallyn, Eric W. Biederman, Oleg Nesterov, David Howells, Richard Weinberger, Pavel Emelyanov, Vasiliy Kulikov On Wed, Sep 24, 2014 at 06:00:26PM +0800, Chen Hanxiao wrote: > +static int > +pidns_list_filter(void) > +{ > + struct pidns_list *pos, *pos_t; > + struct pid_namespace *ns0, *ns1; > + struct pid *pid0, *pid1; > + int flag = 0; > + int rc; > + > + /* screen pid with relationship > + * in pidns_list, we may add pids like > + * ns0 ns1 ns2 > + * pid1->pid2->pid3 > + * we should screen pid1, pid2 and keep pid3 > + */ > + list_for_each_entry(pos, &pidns_list, list) { > + list_for_each_entry(pos_t, &pidns_list, list) { In the previous thread I tried to note this will be terribly inefficient to use and adding a list of children to pid_namespace struct would deal with the problem. 
> + flag = 0; > + pid0 = pos->pid; > + pid1 = pos_t->pid; > + ns0 = pid0->numbers[pid0->level].ns; > + ns1 = pid1->numbers[pid1->level].ns; > + if (pos->pid->level < pos_t->pid->level) > + for (; ns1 != NULL; ns1 = ns1->parent) > + if (ns0 == ns1) { > + flag = 1; > + break; > + } > + if (flag == 1) > + break; > + } > + > + if (flag == 0) { > + rc = pidns_list_add(pos->pid, &pidns_tree); > + if (rc) > + goto out; > + } > + } > + > + /* Now all usefull stuff are in pidns_tree, free pidns_list*/ > + free_pidns_list(&pidns_list); > + > + return 0; > + > +out: > + free_pidns_list(&pidns_tree); > + return rc; > +} > + > +/* collect pids in pidns_list, > + * then remove duplicated ones, > + * add the rest to pidns_tree > + */ > +static int proc_pidns_list_refresh(void) > +{ > + struct pid *pid; > + struct task_struct *p; > + int rc; > + > + /* collect pid in differet ns */ > + rcu_read_lock(); > + for_each_process(p) { > + pid = task_pid(p); > + if (pid && (pid->level > 0)) { > + rc = pidns_list_add(pid, &pidns_list); > + if (rc) > + goto out; > + } > + } > + > + /* screen duplicate pids from list pidns_list > + * and form a new list pidns_tree > + */ > + rc = pidns_list_filter(); > + if (rc) > + goto out; > + rcu_read_unlock(); > + > + return 0; > + > +out: > + free_pidns_list(&pidns_list); > + rcu_read_unlock(); > + return rc; > +} > + > +static int nslist_proc_show(struct seq_file *m, void *v) > +{ > + struct pidns_list *pos; > + struct pid_namespace *ns, *curr_ns; > + struct pid *pid; > + char pid_buf[32]; > + int i, curr_level; > + int rc; > + > + curr_ns = task_active_pid_ns(current); > + > + mutex_lock(&pidns_list_lock); > + rc = proc_pidns_list_refresh(); > + if (rc) { > + mutex_unlock(&pidns_list_lock); > + return rc; > + } > + > + /* print pid namespace hierarchy */ > + list_for_each_entry(pos, &pidns_tree, list) { What keeps pid_namespace's safe to use? 
Similarly to previous patch, here we hit a place where the code is not protected with rcu and structures were just plugged into the list. Recreating the list for each open seems quite unnecessary as well. One could work around that by caching generated output and having a generation counter for namespaces to know whether the content is stale. But that still does not seem right. It looks like in the original thread someone suggested hooking this up under proc as a directory tree which sounds much better to me. Just my $0,03. > + pid = pos->pid; > + curr_level = -1; > + ns = pid->numbers[pid->level].ns; > + /* Check whether a pid has relationship with current ns */ > + for (; ns != NULL; ns = ns->parent) > + if (ns == curr_ns) > + curr_level = curr_ns->level; > + > + if (curr_level == -1) > + continue; > + > + for (i = curr_level + 1; i <= pid->level; i++) { > + ns = pid->numbers[i].ns; > + /* show PID '1' in specific pid ns */ > + snprintf(pid_buf, 32, "/proc/%u/ns/pid", > + pid_vnr(find_pid_ns(1, ns))); > + seq_printf(m, "%s ", pid_buf); > + } > + > + seq_putc(m, '\n'); > + } > + > + free_pidns_list(&pidns_tree); > + mutex_unlock(&pidns_list_lock); > + > + return 0; > +} -- Mateusz Guzik ^ permalink raw reply [flat|nested] 11+ messages in thread
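Editor's note: the workaround mentioned above (cache the generated output and use a generation counter to detect staleness) could look roughly like the user-space sketch below. Everything here is hypothetical: ns_generation, struct hier_cache and both functions are illustration only, the "rebuild" is a placeholder string, and locking is ignored. The reviewer himself notes that this approach "still does not seem right", so treat it as a picture of the idea rather than a recommendation.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical global counter. A real implementation would bump it
 * wherever a pid namespace is created or destroyed.
 */
static unsigned long ns_generation;

struct hier_cache {
	unsigned long generation;	/* ns_generation when buf was built */
	int valid;			/* 0 until the first rebuild */
	int rebuilds;			/* number of rebuilds, for demonstration */
	char buf[128];			/* cached text output */
};

/* Record that the set of namespaces changed. */
static void ns_changed(void)
{
	ns_generation++;
}

/* Return the cached text, regenerating it only when it is stale. */
static const char *hier_cache_get(struct hier_cache *c)
{
	assert(c != NULL);

	if (!c->valid || c->generation != ns_generation) {
		/* Placeholder for the expensive walk over all namespaces. */
		snprintf(c->buf, sizeof(c->buf), "hierarchy@gen%lu",
			 ns_generation);
		c->generation = ns_generation;
		c->valid = 1;
		c->rebuilds++;
	}

	return c->buf;
}
```

Two reads with no namespace change in between trigger a single rebuild; a read after ns_changed() triggers another one.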
* RE: [PATCHv3 1/2] procfs: show hierarchy of pid namespace 2014-09-24 17:45 ` Mateusz Guzik @ 2014-09-25 9:45 ` Chen, Hanxiao 0 siblings, 0 replies; 11+ messages in thread From: Chen, Hanxiao @ 2014-09-25 9:45 UTC (permalink / raw) To: Mateusz Guzik Cc: containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Serge Hallyn, Eric W. Biederman, Oleg Nesterov, David Howells, Richard Weinberger, Pavel Emelyanov, Vasiliy Kulikov [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 5315 bytes --] Hi, > -----Original Message----- > From: Mateusz Guzik [mailto:mguzik@redhat.com] > Sent: Thursday, September 25, 2014 1:45 AM > On Wed, Sep 24, 2014 at 06:00:26PM +0800, Chen Hanxiao wrote: > > +static int > > +pidns_list_filter(void) > > +{ > > + struct pidns_list *pos, *pos_t; > > + struct pid_namespace *ns0, *ns1; > > + struct pid *pid0, *pid1; > > + int flag = 0; > > + int rc; > > + > > + /* screen pid with relationship > > + * in pidns_list, we may add pids like > > + * ns0 ns1 ns2 > > + * pid1->pid2->pid3 > > + * we should screen pid1, pid2 and keep pid3 > > + */ > > + list_for_each_entry(pos, &pidns_list, list) { > > + list_for_each_entry(pos_t, &pidns_list, list) { > > In the previous thread I tried to note this will be terribly inefficient > to use and adding a list of children to pid_namespace struct would deal > with the problem. If that, we had to add a children list, maybe another sibling list in pid_namespace struct and maintain them. That cost too much. For this feature, I think we'd better not touch pid_namespace struct. As we need not to know the hierarchy so frequently, I think such kind of inefficient is acceptable. 
> > > + flag = 0; > > + pid0 = pos->pid; > > + pid1 = pos_t->pid; > > + ns0 = pid0->numbers[pid0->level].ns; > > + ns1 = pid1->numbers[pid1->level].ns; > > + if (pos->pid->level < pos_t->pid->level) > > + for (; ns1 != NULL; ns1 = ns1->parent) > > + if (ns0 == ns1) { > > + flag = 1; > > + break; > > + } > > + if (flag == 1) > > + break; > > + } > > + > > + if (flag == 0) { > > + rc = pidns_list_add(pos->pid, &pidns_tree); > > + if (rc) > > + goto out; > > + } > > + } > > + > > + /* Now all usefull stuff are in pidns_tree, free pidns_list*/ > > + free_pidns_list(&pidns_list); > > + > > + return 0; > > + > > +out: > > + free_pidns_list(&pidns_tree); > > + return rc; > > +} > > + > > +/* collect pids in pidns_list, > > + * then remove duplicated ones, > > + * add the rest to pidns_tree > > + */ > > +static int proc_pidns_list_refresh(void) > > +{ > > + struct pid *pid; > > + struct task_struct *p; > > + int rc; > > + > > + /* collect pid in differet ns */ > > + rcu_read_lock(); > > + for_each_process(p) { > > + pid = task_pid(p); > > + if (pid && (pid->level > 0)) { > > + rc = pidns_list_add(pid, &pidns_list); > > + if (rc) > > + goto out; > > + } > > + } > > + > > + /* screen duplicate pids from list pidns_list > > + * and form a new list pidns_tree > > + */ > > + rc = pidns_list_filter(); > > + if (rc) > > + goto out; > > + rcu_read_unlock(); > > + > > + return 0; > > + > > +out: > > + free_pidns_list(&pidns_list); > > + rcu_read_unlock(); > > + return rc; > > +} > > + > > +static int nslist_proc_show(struct seq_file *m, void *v) > > +{ > > + struct pidns_list *pos; > > + struct pid_namespace *ns, *curr_ns; > > + struct pid *pid; > > + char pid_buf[32]; > > + int i, curr_level; > > + int rc; > > + > > + curr_ns = task_active_pid_ns(current); > > + > > + mutex_lock(&pidns_list_lock); > > + rc = proc_pidns_list_refresh(); > > + if (rc) { > > + mutex_unlock(&pidns_list_lock); > > + return rc; > > + } > > + > > + /* print pid namespace hierarchy */ > > + 
list_for_each_entry(pos, &pidns_tree, list) { > > What keeps pid_namespace's safe to use? Similarly to previous patch, > here we hit a place where the code is not protected with rcu and > structures were just plugged into the list. > Will fix. All list should be protected by rcu lock. > Recreating the list for each open seems quite unnecessary as well. > > One could work around that by caching generated output and having a > generation counter for namespaces to know whether the content is stale. > But that still does not seem right. > That will bring another issue: We had to *keep* that list and update it even if we don't open pidns_hierarchy. This patch try to solve this when open /proc/pidns_hierarchy by: a) recreated the list when open b) show it in a procfs text file c) drop that list > It looks like in the original thread someone suggested hooking this up > under proc as a directory tree which sounds much better to me. Dir trees provide the same information as proc text files did: a symlink name like /proc/PID/ns/pid. Refresh dir trees needs a lot of codes too. So a procfs text file is a better choice. Thanks, - Chen > > Just my $0,03. 
> > > + pid = pos->pid; > > + curr_level = -1; > > + ns = pid->numbers[pid->level].ns; > > + /* Check whether a pid has relationship with current ns */ > > + for (; ns != NULL; ns = ns->parent) > > + if (ns == curr_ns) > > + curr_level = curr_ns->level; > > + > > + if (curr_level == -1) > > + continue; > > + > > + for (i = curr_level + 1; i <= pid->level; i++) { > > + ns = pid->numbers[i].ns; > > + /* show PID '1' in specific pid ns */ > > + snprintf(pid_buf, 32, "/proc/%u/ns/pid", > > + pid_vnr(find_pid_ns(1, ns))); > > + seq_printf(m, "%s ", pid_buf); > > + } > > + > > + seq_putc(m, '\n'); > > + } > > + > > + free_pidns_list(&pidns_tree); > > + mutex_unlock(&pidns_list_lock); > > + > > + return 0; > > +} > > -- > Mateusz Guzik ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns
2014-09-24 10:00 [PATCHv3 0/2] ns, procfs: pid conversion between ns and showing pidns hierarchy Chen Hanxiao
2014-09-24 10:00 ` [PATCHv3 1/2] procfs: show hierarchy of pid namespace Chen Hanxiao
@ 2014-09-24 10:00 ` Chen Hanxiao
2014-09-25 12:52 ` Chen, Hanxiao
2014-09-26 10:20 ` Chen, Hanxiao
1 sibling, 2 replies; 11+ messages in thread
From: Chen Hanxiao @ 2014-09-24 10:00 UTC (permalink / raw)
To: containers, linux-kernel
Cc: Serge Hallyn, Eric W. Biederman, Oleg Nesterov, David Howells,
Richard Weinberger, Pavel Emelyanov, Vasiliy Kulikov,
Mateusz Guzik
If a problem occurs inside a container guest, the host user
cannot tell which process is in trouble from the guest pid alone:
users of the container guest only know the pid inside the container.
This is an obstacle to troubleshooting.
This patch adds four fields: NStgid, NSpid, NSpgid and NSsid:
a) In init_pid_ns, nothing changed;
b) For a process in a pidns, they show its pid inside the containers:
NStgid: 21776 5 1
NSpid: 21776 5 1
NSpgid: 21776 5 1
NSsid: 21729 1 0
** Process id is 21776 in level 0, 5 in level 1, 1 in level 2.
c) If the pidns is nested, the output depends on which pidns you are in.
NStgid: 5 1
NSpid: 5 1
NSpgid: 5 1
NSsid: 1 0
** View from level 1
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
---
v3: add another two fields: NSpgid and NSsid.
v2: add two new fields: NStgid and NSpid.
keep the Tgid and Pid fields unchanged for backward compatibility.
fs/proc/array.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index cd3653e..c30875d 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -193,6 +193,23 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 		from_kgid_munged(user_ns, cred->egid),
 		from_kgid_munged(user_ns, cred->sgid),
 		from_kgid_munged(user_ns, cred->fsgid));
+	seq_puts(m, "NStgid:");
+	for (g = ns->level; g <= pid->level; g++)
+		seq_printf(m, "\t%d ",
+			task_tgid_nr_ns(p, pid->numbers[g].ns));
+	seq_puts(m, "\nNSpid:");
+	for (g = ns->level; g <= pid->level; g++)
+		seq_printf(m, "\t%d ",
+			task_pid_nr_ns(p, pid->numbers[g].ns));
+	seq_puts(m, "\nNSpgid:");
+	for (g = ns->level; g <= pid->level; g++)
+		seq_printf(m, "\t%d ",
+			task_pgrp_nr_ns(p, pid->numbers[g].ns));
+	seq_puts(m, "\nNSsid:");
+	for (g = ns->level; g <= pid->level; g++)
+		seq_printf(m, "\t%d ",
+			task_session_nr_ns(p, pid->numbers[g].ns));
+	seq_putc(m, '\n');
 
 	task_lock(p);
 	if (p->files)
--
1.9.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
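Editor's note: on the consumer side, a tool on the host could read one of the new lines from /proc/PID/status and split it into per-level pids. The sketch below is a user-space illustration under the assumption that the line has the shape the patch prints (field name, colon, then tab-separated numbers, outermost visible level first). The function name parse_ns_field is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a status line such as "NSpid:\t21776 \t5 \t1 \n" into out[].
 * Returns the number of values stored (at most max), or -1 if the
 * line does not start with "<field>:".
 */
static int parse_ns_field(const char *line, const char *field,
			  long *out, int max)
{
	size_t flen;
	const char *p;
	char *end;
	int n = 0;

	assert(line != NULL && field != NULL && out != NULL);

	flen = strlen(field);
	if (strncmp(line, field, flen) != 0 || line[flen] != ':')
		return -1;

	p = line + flen + 1;
	while (n < max) {
		/* strtol() skips leading blanks and tabs itself. */
		long v = strtol(p, &end, 10);

		if (end == p)	/* no further number on this line */
			break;
		out[n++] = v;
		p = end;
	}

	return n;
}
```

A caller would typically read /proc/PID/status line by line with fgets() and try each line against "NSpid"; index 0 of the result is the pid in the namespace of the procfs mount, and the last index is the pid in the task's own namespace.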
* RE: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns 2014-09-24 10:00 ` [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns Chen Hanxiao @ 2014-09-25 12:52 ` Chen, Hanxiao 2014-09-26 10:20 ` Chen, Hanxiao 1 sibling, 0 replies; 11+ messages in thread From: Chen, Hanxiao @ 2014-09-25 12:52 UTC (permalink / raw) To: Eric W. Biederman, Oleg Nesterov Cc: Richard Weinberger, Serge Hallyn, David Howells, containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="gb2312", Size: 2653 bytes --] Hi, Any comments? Thanks, - Chen > -----Original Message----- > From: containers-bounces@lists.linux-foundation.org > Subject: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns > > If some issues occurred inside a container guest, host user > could not know which process is in trouble just by guest pid: > the users of container guest only knew the pid inside containers. > This will bring obstacle for trouble shooting. > > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid: > a) In init_pid_ns, nothing changed; > > b) In one pidns, will tell the pid inside containers: > NStgid: 21776 5 1 > NSpid: 21776 5 1 > NSpgid: 21776 5 1 > NSsid: 21729 1 0 > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2. > > c) If pidns is nested, it depends on which pidns are you in. > NStgid: 5 1 > NSpid: 5 1 > NSpgid: 5 1 > NSsid: 1 0 > ** Views from level 1 > > Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> > --- > v3: add another two fielsd: NSpgid and NSsid. > v2: add two new fields: NStgid and NSpid. > keep fields of Tgid and Pid unchanged for back compatibility. 
> > fs/proc/array.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/fs/proc/array.c b/fs/proc/array.c > index cd3653e..c30875d 100644 > --- a/fs/proc/array.c > +++ b/fs/proc/array.c > @@ -193,6 +193,23 @@ static inline void task_state(struct seq_file *m, struct > pid_namespace *ns, > from_kgid_munged(user_ns, cred->egid), > from_kgid_munged(user_ns, cred->sgid), > from_kgid_munged(user_ns, cred->fsgid)); > + seq_puts(m, "NStgid:"); > + for (g = ns->level; g <= pid->level; g++) > + seq_printf(m, "\t%d ", > + task_tgid_nr_ns(p, pid->numbers[g].ns)); > + seq_puts(m, "\nNSpid:"); > + for (g = ns->level; g <= pid->level; g++) > + seq_printf(m, "\t%d ", > + task_pid_nr_ns(p, pid->numbers[g].ns)); > + seq_puts(m, "\nNSpgid:"); > + for (g = ns->level; g <= pid->level; g++) > + seq_printf(m, "\t%d ", > + task_pgrp_nr_ns(p, pid->numbers[g].ns)); > + seq_puts(m, "\nNSsid:"); > + for (g = ns->level; g <= pid->level; g++) > + seq_printf(m, "\t%d ", > + task_session_nr_ns(p, pid->numbers[g].ns)); > + seq_putc(m, '\n'); > > task_lock(p); > if (p->files) > -- > 1.9.0 > > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/containers ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns 2014-09-24 10:00 ` [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns Chen Hanxiao 2014-09-25 12:52 ` Chen, Hanxiao @ 2014-09-26 10:20 ` Chen, Hanxiao 2014-09-29 14:00 ` Serge E. Hallyn 1 sibling, 1 reply; 11+ messages in thread From: Chen, Hanxiao @ 2014-09-26 10:20 UTC (permalink / raw) To: Eric W. Biederman, containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Oleg Nesterov Cc: Richard Weinberger, Serge Hallyn, Oleg Nesterov, Mateusz Guzik, David Howells, Eric W. Biederman [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="gb2312", Size: 1521 bytes --] Hi, > -----Original Message----- > From: containers-bounces@lists.linux-foundation.org > [mailto:containers-bounces@lists.linux-foundation.org] On Behalf Of Chen > Hanxiao > Sent: Wednesday, September 24, 2014 6:00 PM > To: containers@lists.linux-foundation.org; linux-kernel@vger.kernel.org > Cc: Richard Weinberger; Serge Hallyn; Oleg Nesterov; Mateusz Guzik; David Howells; > Eric W. Biederman > Subject: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns > > If some issues occurred inside a container guest, host user > could not know which process is in trouble just by guest pid: > the users of container guest only knew the pid inside containers. > This will bring obstacle for trouble shooting. > > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid: > a) In init_pid_ns, nothing changed; > > b) In one pidns, will tell the pid inside containers: > NStgid: 21776 5 1 > NSpid: 21776 5 1 > NSpgid: 21776 5 1 > NSsid: 21729 1 0 > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2. > > c) If pidns is nested, it depends on which pidns are you in. > NStgid: 5 1 > NSpid: 5 1 > NSpgid: 5 1 > NSsid: 1 0 > ** Views from level 1 > This patch is simple, useful and safe. But currently there is not any feedbacks. 
Any comments or ideas?
Thanks,
- Chen
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns 2014-09-26 10:20 ` Chen, Hanxiao @ 2014-09-29 14:00 ` Serge E. Hallyn 2014-09-30 10:37 ` Chen, Hanxiao 0 siblings, 1 reply; 11+ messages in thread From: Serge E. Hallyn @ 2014-09-29 14:00 UTC (permalink / raw) To: Chen, Hanxiao Cc: Eric W. Biederman, containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Oleg Nesterov, Richard Weinberger, Serge Hallyn, Mateusz Guzik, David Howells Quoting Chen, Hanxiao (chenhanxiao@cn.fujitsu.com): > Hi, > > > -----Original Message----- > > From: containers-bounces@lists.linux-foundation.org > > [mailto:containers-bounces@lists.linux-foundation.org] On Behalf Of Chen > > Hanxiao > > Sent: Wednesday, September 24, 2014 6:00 PM > > To: containers@lists.linux-foundation.org; linux-kernel@vger.kernel.org > > Cc: Richard Weinberger; Serge Hallyn; Oleg Nesterov; Mateusz Guzik; David Howells; > > Eric W. Biederman > > Subject: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns > > > > If some issues occurred inside a container guest, host user > > could not know which process is in trouble just by guest pid: > > the users of container guest only knew the pid inside containers. > > This will bring obstacle for trouble shooting. > > > > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid: > > a) In init_pid_ns, nothing changed; > > > > b) In one pidns, will tell the pid inside containers: > > NStgid: 21776 5 1 > > NSpid: 21776 5 1 > > NSpgid: 21776 5 1 > > NSsid: 21729 1 0 > > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2. > > > > c) If pidns is nested, it depends on which pidns are you in. > > NStgid: 5 1 > > NSpid: 5 1 > > NSpgid: 5 1 > > NSsid: 1 0 > > ** Views from level 1 > > > > This patch is simple, useful and safe. > But currently there is not any feedbacks. > > Any comments or ideas? Thanks, Chen. The code looks fine. 
My concern is that you are exposing information which cannot be checkpointed and restarted. In particular, if I'm inside a nested container, so I'm in pidns level 3, then my own NSpid info, when I read it, will show the pids at parent namespaces. If I'm restarted at the third pidns level, only the one pid can be restored. Now it may be fair to say "this is proc, and proc and sys show host info which is not containerized and cannot be checkpointed and restarted, deal with it." But I'm not sure. There are two ways you could deal with this. One would be to show the nspids only to the level of the reader of the file - but I don't think you need to do that. I think you're better off simply showing the pids up to the level of the struct pid for the mounter of the procfs. So if I'm inside container c2 which is inside container c1, my own /proc will only show pids which are valid in c2 (and any child namespaces), while the /proc mounted in c1 will show pids valid in c1 and c2 (and any children), but not those in the init_pid_ns. It's then just up to the container administrators to make sure that c2 cannot see c1's /proc to confuse itself and confuddle checkpoint-restart -serge ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns 2014-09-29 14:00 ` Serge E. Hallyn @ 2014-09-30 10:37 ` Chen, Hanxiao 2014-09-30 16:05 ` Serge E. Hallyn 0 siblings, 1 reply; 11+ messages in thread From: Chen, Hanxiao @ 2014-09-30 10:37 UTC (permalink / raw) To: Serge E. Hallyn Cc: Eric W. Biederman, containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Oleg Nesterov, Richard Weinberger, Serge Hallyn, Mateusz Guzik, David Howells, Pavel Emelyanov (xemul@parallels.com) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="gb2312", Size: 3114 bytes --] Hi, > -----Original Message----- > From: Serge E. Hallyn [mailto:serge@hallyn.com] > Sent: Monday, September 29, 2014 10:00 PM > Subject: Re: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to > ns [snip] > > > > > > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid: > > > a) In init_pid_ns, nothing changed; > > > > > > b) In one pidns, will tell the pid inside containers: > > > NStgid: 21776 5 1 > > > NSpid: 21776 5 1 > > > NSpgid: 21776 5 1 > > > NSsid: 21729 1 0 > > > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2. > > > > > > c) If pidns is nested, it depends on which pidns are you in. > > > NStgid: 5 1 > > > NSpid: 5 1 > > > NSpgid: 5 1 > > > NSsid: 1 0 > > > ** Views from level 1 > > > > > > > This patch is simple, useful and safe. > > But currently there is not any feedbacks. > > > > Any comments or ideas? > > Thanks, Chen. The code looks fine. My concern is that you are > exposing information which cannot be checkpointed and restarted. > In particular, if I'm inside a nested container, so I'm in pidns > level 3, then my own NSpid info, when I read it, will show the > pids at parent namespaces. If I'm restarted at the third pidns > level, only the one pid can be restored. If you're in level 3, read your own proc, only level 3's NSpid info will be shown. 
No parent namespace info can be seen.
Only if no procfs mount point is provided for the new container,
and there is no proper pivot_root,
can some NSpid info of the parent ns be seen.
If each new container gets its own procfs mount point,
only its own and its children's NSpid info can be seen.
>
> Now it may be fair to say "this is proc, and proc and sys show
> host info which is not containerized and cannot be checkpointed
> and restarted, deal with it." But I'm not sure.
>
> There are two ways you could deal with this. One would be to
> show the nspids only to the level of the reader of the file - but
> I don't think you need to do that. I think you're better off
> simply showing the pids up to the level of the struct pid for
> the mounter of the procfs. So if I'm inside container c2 which
> is inside container c1, my own /proc will only show pids which
> are valid in c2 (and any child namespaces), while the /proc
> mounted in c1 will show pids valid in c1 and c2 (and any children),
> but not those in the init_pid_ns. It's then just up to the
> container administrators to make sure that c2 cannot see c1's
> /proc to confuse itself and confuddle checkpoint-restart
IIUC, this patch already deals with this scenario:
+ for (g = ns->level; g <= pid->level; g++)
+ seq_printf(m, "\t%d ",
+ task_tgid_nr_ns(p, pid->numbers[g].ns));
With this patch, the output looks like:
a) in init_pid_ns, check /proc/21776/status
NStgid: 21776 5 1
b) in c1, check /proc/5/status:
NStgid: 5 1
c) in c2, check /proc/1/status:
NStgid: 1
Thanks,
- Chen
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns
  2014-09-30 10:37 ` Chen, Hanxiao
@ 2014-09-30 16:05 ` Serge E. Hallyn
  2014-09-30 16:07 ` Serge E. Hallyn
  0 siblings, 1 reply; 11+ messages in thread
From: Serge E. Hallyn @ 2014-09-30 16:05 UTC (permalink / raw)
To: Chen, Hanxiao
Cc: Serge E. Hallyn, Eric W. Biederman,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, Oleg Nesterov, Richard Weinberger,
	Serge Hallyn, Mateusz Guzik, David Howells,
	Pavel Emelyanov (xemul@parallels.com)

Quoting Chen, Hanxiao (chenhanxiao@cn.fujitsu.com):
> Hi,
>
> > -----Original Message-----
> > From: Serge E. Hallyn [mailto:serge@hallyn.com]
> > Sent: Monday, September 29, 2014 10:00 PM
> > Subject: Re: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to
> > ns
> [snip]
> > > >
> > > > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid:
> > > > a) In init_pid_ns, nothing changed;
> > > >
> > > > b) In one pidns, will tell the pid inside containers:
> > > > NStgid: 21776 5 1
> > > > NSpid: 21776 5 1
> > > > NSpgid: 21776 5 1
> > > > NSsid: 21729 1 0
> > > > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2.
> > > >
> > > > c) If pidns is nested, it depends on which pidns are you in.
> > > > NStgid: 5 1
> > > > NSpid: 5 1
> > > > NSpgid: 5 1
> > > > NSsid: 1 0
> > > > ** Views from level 1
> > > >
> > >
> > > This patch is simple, useful and safe.
> > > But currently there is not any feedbacks.
> > >
> > > Any comments or ideas?
> >
> > Thanks, Chen. The code looks fine. My concern is that you are
> > exposing information which cannot be checkpointed and restarted.
> > In particular, if I'm inside a nested container, so I'm in pidns
> > level 3, then my own NSpid info, when I read it, will show the
> > pids at parent namespaces. If I'm restarted at the third pidns
> > level, only the one pid can be restored.
>
> If you're in level 3 and read your own proc, only level 3's NSpid info
> will be shown.
> No parent namespace info can be seen.

D'oh! Sorry, I see, you're starting at ns->level. And ns is the ns
of the proc mount, not the caller. That looks good.

So

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>

> Only when the new container is not given its own procfs mount point,
> and there is no proper pivot_root,
> could we see some NSpid info of the parent ns.
>
> If each new container gets its own procfs mount point,
> only its own and its children's NSpid info can be seen.
> >
> > Now it may be fair to say "this is proc, and proc and sys show
> > host info which is not containerized and cannot be checkpointed
> > and restarted, deal with it." But I'm not sure.
> >
> > There are two ways you could deal with this. One would be to
> > show the nspids only to the level of the reader of the file - but
> > I don't think you need to do that. I think you're better off
> > simply showing the pids up to the level of the struct pid for
> > the mounter of the procfs. So if I'm inside container c2 which
> > is inside container c1, my own /proc will only show pids which
> > are valid in c2 (and any child namespaces), while the /proc
> > mounted in c1 will show pids valid in c1 and c2 (and any children),
> > but not those in the init_pid_ns. It's then just up to the
> > container administrators to make sure that c2 cannot see c1's
> > /proc to confuse itself and confuddle checkpoint-restart
>
> IIUC, this patch already deals with this scenario:
>
> +	for (g = ns->level; g <= pid->level; g++)
> +		seq_printf(m, "\t%d ",
> +			task_tgid_nr_ns(p, pid->numbers[g].ns));
>
> With this patch, the output looks like this:
> a) in init_pid_ns, check /proc/21776/status
> NStgid: 21776 5 1
>
> b) in c1, check /proc/5/status:
> NStgid: 5 1
>
> c) in c2, check /proc/1/status:
> NStgid: 1
>
> Thanks,
> - Chen
* Re: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns
  2014-09-30 16:05 ` Serge E. Hallyn
@ 2014-09-30 16:07 ` Serge E. Hallyn
  0 siblings, 0 replies; 11+ messages in thread
From: Serge E. Hallyn @ 2014-09-30 16:07 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Chen, Hanxiao, Eric W. Biederman,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, Oleg Nesterov, Richard Weinberger,
	Serge Hallyn, Mateusz Guzik, David Howells,
	Pavel Emelyanov (xemul@parallels.com)

Quoting Serge E. Hallyn (serge@hallyn.com):
> Quoting Chen, Hanxiao (chenhanxiao@cn.fujitsu.com):
> > Hi,
> >
> > > -----Original Message-----
> > > From: Serge E. Hallyn [mailto:serge@hallyn.com]
> > > Sent: Monday, September 29, 2014 10:00 PM
> > > Subject: Re: [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to
> > > ns
> > [snip]
> > > > >
> > > > > This patch adds four fields: NStgid, NSpid, NSpgid and NSsid:
> > > > > a) In init_pid_ns, nothing changed;
> > > > >
> > > > > b) In one pidns, will tell the pid inside containers:
> > > > > NStgid: 21776 5 1
> > > > > NSpid: 21776 5 1
> > > > > NSpgid: 21776 5 1
> > > > > NSsid: 21729 1 0
> > > > > ** Process id is 21776 in level 0, 5 in level 1, 1 in level 2.
> > > > >
> > > > > c) If pidns is nested, it depends on which pidns are you in.
> > > > > NStgid: 5 1
> > > > > NSpid: 5 1
> > > > > NSpgid: 5 1
> > > > > NSsid: 1 0
> > > > > ** Views from level 1
> > > > >
> > > >
> > > > This patch is simple, useful and safe.
> > > > But currently there is not any feedbacks.
> > > >
> > > > Any comments or ideas?
> > >
> > > Thanks, Chen. The code looks fine. My concern is that you are
> > > exposing information which cannot be checkpointed and restarted.
> > > In particular, if I'm inside a nested container, so I'm in pidns
> > > level 3, then my own NSpid info, when I read it, will show the
> > > pids at parent namespaces. If I'm restarted at the third pidns
> > > level, only the one pid can be restored.
> >
> > If you're in level 3 and read your own proc, only level 3's NSpid info
> > will be shown. No parent namespace info can be seen.
>
> D'oh! Sorry, I see, you're starting at ns->level. And ns is the ns
> of the proc mount, not the caller. That looks good.
>
> So
>
> Acked-by: Serge Hallyn <serge.hallyn@canonical.com>

Also

Tested-by: Serge Hallyn <serge.hallyn@canonical.com>

as I've tested this between a few containers.
end of thread, other threads:[~2014-09-30 16:07 UTC | newest]

Thread overview: 11+ messages
2014-09-24 10:00 [PATCHv3 0/2] ns, procfs: pid conversion between ns and showing pidns hierarchy Chen Hanxiao
2014-09-24 10:00 ` [PATCHv3 1/2] procfs: show hierarchy of pid namespace Chen Hanxiao
2014-09-24 17:45 ` Mateusz Guzik
2014-09-25 9:45 ` Chen, Hanxiao
2014-09-24 10:00 ` [PATCHv3 2/2] /proc/PID/status: show all sets of pid according to ns Chen Hanxiao
2014-09-25 12:52 ` Chen, Hanxiao
2014-09-26 10:20 ` Chen, Hanxiao
2014-09-29 14:00 ` Serge E. Hallyn
2014-09-30 10:37 ` Chen, Hanxiao
2014-09-30 16:05 ` Serge E. Hallyn
2014-09-30 16:07 ` Serge E. Hallyn