* [PATCH] Introduce Vpid: in /proc/self/status
From: Greg Kurz @ 2011-06-10 9:46 UTC (permalink / raw)
To: containers-qjLDD68F18O7TbgM5vRIOg
Cc: legoater-GANU6spQydw, ebiederm-aS9lmoZGLiVWk0Htik3J/w
Since pid namespaces were introduced, there has been a recurring question: how
can one correlate a pid from a child pid ns with a pid from a parent pid ns?
The need arises in the LXC community when one wants to send a signal from
the host (i.e. the init_pid_ns context) to a container process for which one
only knows the pid inside the container.

In the future, this should be achievable thanks to Eric Biederman's setns()
syscall, but there's still some work to be done to support pid namespaces:

https://lkml.org/lkml/2011/5/21/162

As stated by Serge Hallyn in:

http://sourceforge.net/mailarchive/message.php?msg_id=27424447

"There is nothing that gives you a 100% guaranteed correct race-free
correspondence right now. You can look under /proc/<pid>/root/proc/ to
see the pids valid in the container, and you can relate output of
lxc-ps --forest to ps --forest output. But nothing under /proc that I
know of tells you "this task is the same as that task". You can't
even look at /proc/<pid> inode numbers since they are different
filesystems for each proc mount."

This patch adds a single line, VPid, to /proc/self/status: the pid of the
task as seen from its own active pid namespace. Provided one has kept track
of the container's tasks (with a cgroup, as liblxc does for example), one can
then correlate global pids and container pids. This is still racy, but
definitely easier than what we have today.

Signed-off-by: Greg Kurz <gkurz-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
---
fs/proc/array.c | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 5e4f776..f9db2a4 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -165,7 +165,8 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	int g;
 	struct fdtable *fdt = NULL;
 	const struct cred *cred;
-	pid_t ppid, tpid;
+	struct pid_namespace *pid_ns;
+	pid_t ppid, tpid, vpid;
 
 	rcu_read_lock();
 	ppid = pid_alive(p) ?
@@ -176,6 +177,8 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 		if (tracer)
 			tpid = task_pid_nr_ns(tracer, ns);
 	}
+	pid_ns = task_active_pid_ns(p);
+	vpid = pid_ns ? task_pid_nr_ns(p, pid_ns) : 0;
 	cred = get_task_cred(p);
 	seq_printf(m,
 		"State:\t%s\n"
@@ -183,12 +186,13 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 		"Pid:\t%d\n"
 		"PPid:\t%d\n"
 		"TracerPid:\t%d\n"
+		"VPid:\t%d\n"
 		"Uid:\t%d\t%d\t%d\t%d\n"
 		"Gid:\t%d\t%d\t%d\t%d\n",
 		get_task_state(p),
 		task_tgid_nr_ns(p, ns),
 		pid_nr_ns(pid, ns),
-		ppid, tpid,
+		ppid, tpid, vpid,
 		cred->uid, cred->euid, cred->suid, cred->fsuid,
 		cred->gid, cred->egid, cred->sgid, cred->fsgid);
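
For illustration only, here is a minimal user-space sketch of how a tool running on the host might consume the proposed line. It assumes a kernel with this patch applied (the VPid field does not exist otherwise); the helper names status_field() and read_status_field() are hypothetical and not part of the patch or of liblxc.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "<key>:\t<number>" in a buffer laid out like /proc/<pid>/status
 * and return the first number on that line, or -1 if the key is absent.
 * Keys are matched at the start of a line, so "Pid" does not match "PPid".
 */
long status_field(const char *text, const char *key)
{
    size_t klen;
    const char *line;

    assert(text != NULL && key != NULL);
    klen = strlen(key);
    for (line = text; line && *line; ) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return strtol(line + klen + 1, NULL, 10);
        line = strchr(line, '\n');
        if (line)
            line++;            /* step past the newline */
    }
    return -1;
}

/*
 * Read /proc/<pid>/status (pid is a global pid, as seen from the host)
 * and return the requested field, or -1 on error or if it is missing.
 */
long read_status_field(long pid, const char *key)
{
    char path[64], buf[4096];
    size_t n;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/status", pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return status_field(buf, key);
}
```

With the patch applied, calling read_status_field(global_pid, "VPid") for each task in the container's cgroup would yield the in-container pid to compare against. As the commit message says, this remains racy: the task may exit, and its pid may be reused, between the read and any later kill().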
* Re: [PATCH] Introduce Vpid: in /proc/self/status
From: Cedric Le Goater @ 2011-06-10 13:33 UTC (permalink / raw)
To: Greg Kurz
Cc: containers-qjLDD68F18O7TbgM5vRIOg,
ebiederm-aS9lmoZGLiVWk0Htik3J/w
On 06/10/2011 11:46 AM, Greg Kurz wrote:
> Since pid namespaces were introduced, there's a recurring demand: how one
> can correlate a pid from a child pid ns with a pid from a parent pid ns ?
> The need arises in the LXC community when one wants to send a signal from
> the host (aka. init_pid_ns context) to a container process for which one
> only knows the pid inside the container.
>
> In the future, this should be achievable thanks to Eric Biederman's setns()
> syscall but there's still some work to be done to support pid namespaces:
>
> https://lkml.org/lkml/2011/5/21/162
>
> As stated by Serge Hallyn in:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=27424447
>
> "There is nothing that gives you a 100% guaranteed correct race-free
> correspondence right now. You can look under /proc/<pid>/root/proc/ to
> see the pids valid in the container, and you can relate output of
> lxc-ps --forest to ps --forest output. But nothing under /proc that I
> know of tells you "this task is the same as that task". You can't
> even look at /proc/<pid> inode numbers since they are different
> filesystems for each proc mount."
>
> This patch adds a single line to /proc/self/status. Provided one has kept
> track of its container tasks (with a cgroup like liblxc does for example),
> he may correlate global pids and container pids. This is still racy but
> definitely easier than what we have today.
>
> Signed-off-by: Greg Kurz <gkurz-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
Acked-by: Cedric Le Goater <clg-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
Thanks,
C.
> ---
>
> fs/proc/array.c | 8 ++++++--
> 1 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 5e4f776..f9db2a4 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -165,7 +165,8 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
> int g;
> struct fdtable *fdt = NULL;
> const struct cred *cred;
> - pid_t ppid, tpid;
> + struct pid_namespace *pid_ns;
> + pid_t ppid, tpid, vpid;
>
> rcu_read_lock();
> ppid = pid_alive(p) ?
> @@ -176,6 +177,8 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
> if (tracer)
> tpid = task_pid_nr_ns(tracer, ns);
> }
> + pid_ns = task_active_pid_ns(p);
> + vpid = pid_ns ? task_pid_nr_ns(p, pid_ns) : 0;
> cred = get_task_cred(p);
> seq_printf(m,
> "State:\t%s\n"
> @@ -183,12 +186,13 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
> "Pid:\t%d\n"
> "PPid:\t%d\n"
> "TracerPid:\t%d\n"
> + "VPid:\t%d\n"
> "Uid:\t%d\t%d\t%d\t%d\n"
> "Gid:\t%d\t%d\t%d\t%d\n",
> get_task_state(p),
> task_tgid_nr_ns(p, ns),
> pid_nr_ns(pid, ns),
> - ppid, tpid,
> + ppid, tpid, vpid,
> cred->uid, cred->euid, cred->suid, cred->fsuid,
> cred->gid, cred->egid, cred->sgid, cred->fsgid);
>
>
* Re: [PATCH] Introduce Vpid: in /proc/self/status
From: Eric W. Biederman @ 2011-06-12 1:46 UTC (permalink / raw)
To: Greg Kurz; +Cc: containers-qjLDD68F18O7TbgM5vRIOg, legoater-GANU6spQydw
Greg Kurz <gkurz-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org> writes:
> Since pid namespaces were introduced, there's a recurring demand: how one
> can correlate a pid from a child pid ns with a pid from a parent pid ns ?
> The need arises in the LXC community when one wants to send a signal from
> the host (aka. init_pid_ns context) to a container process for which one
> only knows the pid inside the container.

You are missing the sighand lock, which is needed to make
task_active_pid_ns safe when called on something other than current.

I really don't like the name VPid, perhaps Process Pid. There is
nothing more or less virtual about any pid, so calling any of them
virtual is unclear and misleading.

I'm not exactly certain that /proc/self/status is the right place
for this, but it does seem reasonable.

For what it's worth, if you are communicating through anything except
a pid file, unix domain sockets will give you a race-free way to get
the pid of the process on the other end.

> In the future, this should be achievable thanks to Eric Biederman's setns()
> syscall but there's still some work to be done to support pid namespaces:
>
> https://lkml.org/lkml/2011/5/21/162
>
> As stated by Serge Hallyn in:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=27424447
>
> "There is nothing that gives you a 100% guaranteed correct race-free
> correspondence right now. You can look under /proc/<pid>/root/proc/ to
> see the pids valid in the container, and you can relate output of
> lxc-ps --forest to ps --forest output. But nothing under /proc that I
> know of tells you "this task is the same as that task". You can't
> even look at /proc/<pid> inode numbers since they are different
> filesystems for each proc mount."

As it happens, a unix domain socket can be used to give you a guaranteed
race-free correspondence right now.

So that is what I would recommend if your source of the pid is something
other than a pid file sitting in the filesystem.
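
The message does not say which socket mechanism is meant. As one possible reading, here is a minimal sketch in C using the Linux SO_PEERCRED socket option on a connected unix domain socket. The helper name peer_pid() is hypothetical. As far as I understand it, the kernel reports the peer's pid as seen from the caller's pid namespace, which is what would make this usable from the host.

```c
#define _GNU_SOURCE            /* for struct ucred */
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the pid recorded for the peer of a connected AF_UNIX socket,
 * or -1 on error. The credentials are captured by the kernel when the
 * socket is connected (or created by socketpair()), not supplied by
 * the peer in a message.
 */
pid_t peer_pid(int fd)
{
    struct ucred cred;
    socklen_t len = sizeof(cred);

    assert(fd >= 0);
    if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0)
        return -1;
    return cred.pid;
}
```

A host-side daemon that accepts a connection from a container process could call peer_pid() on the accepted descriptor and signal the returned pid directly, without ever looking at the in-container pid.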
> This patch adds a single line to /proc/self/status. Provided one has kept
> track of its container tasks (with a cgroup like liblxc does for example),
> he may correlate global pids and container pids. This is still racy but
> definitely easier than what we have today.
Eric
* Re: [PATCH] Introduce Vpid: in /proc/self/status
From: Greg Kurz @ 2011-06-14 16:38 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: containers-qjLDD68F18O7TbgM5vRIOg, legoater-GANU6spQydw
On Sat, 2011-06-11 at 18:46 -0700, Eric W. Biederman wrote:
> Greg Kurz <gkurz-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org> writes:
>
> > Since pid namespaces were introduced, there's a recurring demand: how one
> > can correlate a pid from a child pid ns with a pid from a parent pid ns ?
> > The need arises in the LXC community when one wants to send a signal from
> > the host (aka. init_pid_ns context) to a container process for which one
> > only knows the pid inside the container.
>
Eric,
Thanks for your comments.
> You are missing taking the sighand lock which is needed to make
> task_active_pid_ns safe when called on something other than current.
>
Ohh... to prevent __exit_signal()->__unhash_process() from detaching the pids
behind our back, correct? A comment in <linux/pid_namespace.h> would be
appreciated...
> I really don't like the name VPid, perhaps Process Pid. There is
> nothing more or less virtual about any pid, so calling any of the
> virtual is not clear, and misleading.
>
Or 'Active Pid', since we're relying on task_active_pid_ns(). IOW the
output is what the process gets when calling getpid().
> I'm not exactly certain that /proc/self/status is the right place
> for this but this does seem reasonable.
>
Well... I didn't want to add another file, and status is the easiest one
to patch without breaking anything. It seemed reasonable indeed.
> For what it's worth if you are communicating through anything except
> a pid file unix domain sockets will give you a race free to get the
> pid of the process on the other end.
>
I'm in a pid file scenario for the moment... but this could change, so
I'll give the SCM_CREDENTIALS stuff a try.
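
For reference, a rough sketch of what the SCM_CREDENTIALS approach could look like on the receiving side. The helper names enable_passcred() and recv_sender_pid() are hypothetical; the sketch assumes Linux, a unix domain socket, and a sender that transmits at least one byte of data.

```c
#define _GNU_SOURCE            /* for struct ucred and SCM_CREDENTIALS */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Ask the kernel to attach sender credentials to messages received on fd. */
int enable_passcred(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_PASSCRED, &on, sizeof(on));
}

/*
 * Receive one byte from fd and return the sender's pid taken from the
 * accompanying SCM_CREDENTIALS control message, or -1 if the receive
 * fails or no credentials were attached.
 */
pid_t recv_sender_pid(int fd)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {                    /* union keeps the buffer cmsghdr-aligned */
        char buf[CMSG_SPACE(sizeof(struct ucred))];
        struct cmsghdr align;
    } ctl;
    struct msghdr msg;
    struct cmsghdr *c;
    struct ucred cred;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(fd, &msg, 0) <= 0)
        return -1;
    for (c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
            memcpy(&cred, CMSG_DATA(c), sizeof(cred));
            return cred.pid;
        }
    }
    return -1;
}
```

The receiver must call enable_passcred() before the message is sent; the sender then only needs an ordinary write() or send(), since the kernel fills in the credentials itself.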
Cheers.
--
Gregory Kurz gkurz-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org
Software Engineer @ IBM/Meiosys http://www.ibm.com
Tel +33 (0)534 638 479 Fax +33 (0)561 400 420
"Anarchy is about taking complete responsibility for yourself."
Alan Moore.