* Mapping between host & container PIDs ?
From: Daniel P. Berrange @ 2012-11-27 10:15 UTC
  To: Linux Containers

I'm trying to find out if there is a way to map between host and container
PIDs, at minimum in the host -> container direction. My use case is to be
able to kill processes associated with a container, based on the host PID,
in a race free manner.

Given a host PID, I can read the 'tasks' file for the container's cgroup
to verify that the PID is associated with the container in question. Then
I can kill the PID with a signal. There is a small race condition in there,
where the PID could die & a new process could be born using the original
PID. Now this might not be very likely but I was thinking that if it is
possible to map from a host PID to a container PID, you can do it more
safely, e.g. look up the container PID associated with the host PID, then
setns() into the container and kill the container PID. Now although there
is still a race condition, you are guaranteed that if the race hits you'll
only kill a process within the same container, not the host at large,
which is good when the user invoking the API is unprivileged.
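A minimal sketch of the check-then-kill sequence described above, assuming
a cgroup v1 'tasks' file with one decimal ID per line. The function names
are hypothetical and only illustrate where the race window sits.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Return 1 if `pid` is listed in the cgroup 'tasks' file at `path`,
 * 0 if it is not, and -1 if the file cannot be opened. */
int pid_in_tasks_file(const char *path, pid_t pid)
{
    FILE *fp = fopen(path, "r");
    long entry;
    int found = 0;

    if (!fp)
        return -1;
    /* The tasks file holds one decimal ID per line. */
    while (fscanf(fp, "%ld", &entry) == 1) {
        if ((pid_t)entry == pid) {
            found = 1;
            break;
        }
    }
    fclose(fp);
    return found;
}

/* Verify membership, then signal. The gap between the two steps is the
 * race: the PID may have been recycled by the time kill() runs. */
int kill_if_in_container(const char *tasks_path, pid_t pid, int sig)
{
    if (pid_in_tasks_file(tasks_path, pid) != 1)
        return -1;
    return kill(pid, sig);
}
```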

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: Mapping between host & container PIDs ?
From: Serge Hallyn @ 2012-11-27 13:36 UTC
  To: Daniel P. Berrange; +Cc: Linux Containers

Quoting Daniel P. Berrange (berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> I'm trying to find out if there is a way to map between host and container
> PIDs, at minimum in the host -> container direction. My use case is to be
> able to kill processes associated with a container, based on the host PID,
> in a race free manner.
> 
> Given a host PID, I can read the 'tasks' file for the container's cgroup
> to verify that the PID is associated with the container in question. Then
> I can kill the PID with a signal. There is a small race condition in there,
> where the PID could die & a new process could be born using the original
> PID. Now this might not be very likely but I was thinking that if it is
> possible to map from a host PID to a container PID, you can do it more
> safely, e.g. look up the container PID associated with the host PID, then
> setns() into the container and kill the container PID. Now although there
> is still a race condition, you are guaranteed that if the race hits you'll
> only kill a process within the same container, not the host at large,
> which is good when the user invoking the API is unprivileged.

I'm afraid I don't know of any way to do that.  At some point a new
/proc/self/pids or somesuch file was suggested to get that info.
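For reference, kernels from Linux 4.1 onwards expose this kind of
information as an "NSpid:" line in /proc/<pid>/status, listing the PID in
each nested PID namespace with the outermost first. The following is only a
sketch of reading it, assuming such a kernel; the function names are
hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Parse one "NSpid:" line. The last field is the PID as seen in the
 * innermost PID namespace. Returns -1 if this is not an NSpid line. */
pid_t innermost_pid_from_nspid_line(const char *line)
{
    const char *p;
    char *end;
    long last = -1;

    if (strncmp(line, "NSpid:", 6) != 0)
        return -1;
    p = line + 6;
    for (;;) {
        long v = strtol(p, &end, 10);
        if (end == p)   /* no more numbers on the line */
            break;
        last = v;
        p = end;
    }
    return (pid_t)last;
}

/* Look up the innermost-namespace PID for a host PID, or -1 on failure. */
pid_t container_pid_of(pid_t host_pid)
{
    char path[64], line[256];
    pid_t result = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)host_pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    while (fgets(line, sizeof(line), fp)) {
        if (strncmp(line, "NSpid:", 6) == 0) {
            result = innermost_pid_from_nspid_line(line);
            break;
        }
    }
    fclose(fp);
    return result;
}
```

Note that the lookup is itself racy in the same way as the 'tasks' check:
the host PID can be recycled between reading the file and acting on it.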

However, for your use case, what about freezing the container, checking
again that the task exists and is in the container, killing it, then
unfreezing it?
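Roughly, that sequence could look like the sketch below. It assumes the
cgroup v1 freezer interface (a freezer.state file accepting "FROZEN" and
"THAWED") and a 'tasks' file in the same directory; the function names and
the one-second polling limit are arbitrary choices for illustration.

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `state` ("FROZEN" or "THAWED") to <cgroup_dir>/freezer.state. */
int set_freezer_state(const char *cgroup_dir, const char *state)
{
    char path[4096];
    FILE *fp;

    snprintf(path, sizeof(path), "%s/freezer.state", cgroup_dir);
    fp = fopen(path, "w");
    if (!fp)
        return -1;
    if (fputs(state, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Return 1 if freezer.state currently starts with `state`, else 0. */
int freezer_state_is(const char *cgroup_dir, const char *state)
{
    char path[4096], buf[32] = "";
    FILE *fp;

    snprintf(path, sizeof(path), "%s/freezer.state", cgroup_dir);
    fp = fopen(path, "r");
    if (!fp)
        return 0;
    if (!fgets(buf, sizeof(buf), fp))
        buf[0] = '\0';
    fclose(fp);
    return strncmp(buf, state, strlen(state)) == 0;
}

/* Return 1 if `pid` appears in <cgroup_dir>/tasks, else 0. */
int pid_in_cgroup(const char *cgroup_dir, pid_t pid)
{
    char path[4096];
    long entry;
    int found = 0;
    FILE *fp;

    snprintf(path, sizeof(path), "%s/tasks", cgroup_dir);
    fp = fopen(path, "r");
    if (!fp)
        return 0;
    while (!found && fscanf(fp, "%ld", &entry) == 1)
        found = ((pid_t)entry == pid);
    fclose(fp);
    return found;
}

/* Freeze, re-check membership, signal, thaw. Returns kill()'s result,
 * or -1 if any step fails or the PID is not in the cgroup. */
int freeze_kill_thaw(const char *cgroup_dir, pid_t pid, int sig)
{
    int rc = -1, tries;

    if (set_freezer_state(cgroup_dir, "FROZEN") != 0)
        return -1;
    /* Freezing is asynchronous: poll (up to ~1s) until it completes. */
    for (tries = 0; tries < 100 && !freezer_state_is(cgroup_dir, "FROZEN"); tries++)
        usleep(10000);
    /* A task listed in a frozen cgroup cannot exit, so its PID cannot
     * be recycled between this check and the kill(). */
    if (freezer_state_is(cgroup_dir, "FROZEN") && pid_in_cgroup(cgroup_dir, pid))
        rc = kill(pid, sig);
    if (set_freezer_state(cgroup_dir, "THAWED") != 0)
        return -1;
    return rc;
}
```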

(You also should be able to look at /proc/$pid/cgroup as a perhaps
faster way to verify its container, as opposed to searching
/sys/fs/cgroup/freezer/libvirt/lxc/$container/tasks;  then again it's
more complicated to parse, which might offset the searching time in most
cases...)
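The parsing is not too bad in practice. Each line of /proc/<pid>/cgroup
has the form "hierarchy-ID:controller-list:cgroup-path", so a sketch along
these lines would do (the function names are hypothetical, and the caller
still has to compare the returned path against the container's cgroup):

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* If `line` lists `controller` in its comma-separated controller field,
 * copy the cgroup path into `out` and return 0. Otherwise return -1. */
int cgroup_path_for(const char *line, const char *controller,
                    char *out, size_t outlen)
{
    const char *first = strchr(line, ':');
    const char *second, *p;
    size_t clen = strlen(controller), plen;

    if (!first)
        return -1;
    second = strchr(first + 1, ':');
    if (!second)
        return -1;

    for (p = first + 1; p < second; ) {
        const char *comma = memchr(p, ',', (size_t)(second - p));
        const char *tok_end = comma ? comma : second;

        /* Compare whole tokens so "cpu" does not match "cpuacct". */
        if ((size_t)(tok_end - p) == clen && strncmp(p, controller, clen) == 0) {
            plen = strcspn(second + 1, "\n");
            if (plen >= outlen)
                return -1;
            memcpy(out, second + 1, plen);
            out[plen] = '\0';
            return 0;
        }
        p = tok_end + 1;
    }
    return -1;
}

/* Read /proc/<pid>/cgroup and return the path for `controller`. */
int pid_cgroup_path(pid_t pid, const char *controller,
                    char *out, size_t outlen)
{
    char path[64], line[4096];
    int rc = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/cgroup", (long)pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    while (rc != 0 && fgets(line, sizeof(line), fp))
        rc = cgroup_path_for(line, controller, out, outlen);
    fclose(fp);
    return rc;
}
```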

-serge

* Re: Mapping between host & container PIDs ?
From: Daniel P. Berrange @ 2012-11-27 13:47 UTC
  To: Serge Hallyn; +Cc: Linux Containers

On Tue, Nov 27, 2012 at 07:36:09AM -0600, Serge Hallyn wrote:
> Quoting Daniel P. Berrange (berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > I'm trying to find out if there is a way to map between host and container
> > PIDs, at minimum in the host -> container direction. My use case is to be
> > able to kill processes associated with a container, based on the host PID,
> > in a race free manner.
> > 
> > Given a host PID, I can read the 'tasks' file for the container's cgroup
> > to verify that the PID is associated with the container in question. Then
> > I can kill the PID with a signal. There is a small race condition in there,
> > where the PID could die & a new process could be born using the original
> > PID. Now this might not be very likely but I was thinking that if it is
> > possible to map from a host PID to a container PID, you can do it more
> > safely, e.g. look up the container PID associated with the host PID, then
> > setns() into the container and kill the container PID. Now although there
> > is still a race condition, you are guaranteed that if the race hits you'll
> > only kill a process within the same container, not the host at large,
> > which is good when the user invoking the API is unprivileged.
> 
> I'm afraid I don't know of any way to do that.  At some point a new
> /proc/self/pids or somesuch file was suggested to get that info.
> 
> However, for your use case, what about freezing the container, checking
> again that the task exists and is in the container, killing it, then
> unfreezing it?

Yep, that's the bulletproof way, but it feels like rather a big hammer
to use.

> (You also should be able to look at /proc/$pid/cgroup as a perhaps
> faster way to verify its container, as opposed to searching
> /sys/fs/cgroup/freezer/libvirt/lxc/$container/tasks;  then again it's
> more complicated to parse, which might offset the searching time in most
> cases...)

Thinking about it more generally, this isn't really a container specific
problem, but rather an issue with the kill() syscall. It is the same
general class of problem as you see checking file permissions for example,
which is why you would use fstat() instead of stat() in many cases. It
might call for a way to get a FD associated with a pid (eg the /proc/$pid
dir handle) and then be able to kill() via that FD. eg something like

  dirfd = open("/proc/$pid", O_RDONLY);

  exefd = openat(dirfd, "exe", O_RDONLY);
  ...check it is the exe you think it is...

  cgroupfd = openat(dirfd, "cgroup", O_RDONLY);
  ...check the process is where you expect it to be...

  fkill(dirfd, SIGKILL);

That's probably a whole can of worms though, so I think I'll just
restrict myself to killing processes based on the container's view
of the PID for now.

Daniel

* Re: Mapping between host & container PIDs ?
From: Eric W. Biederman @ 2012-11-27 13:50 UTC
  To: Serge Hallyn; +Cc: Linux Containers

Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:

> Quoting Daniel P. Berrange (berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> I'm trying to find out if there is a way to map between host and container
>> PIDs, at minimum in the host -> container direction. My use case is to be
>> able to kill processes associated with a container, based on the host PID,
>> in a race free manner.
>> 
>> Given a host PID, I can read the 'tasks' file for the container's cgroup
>> to verify that the PID is associated with the container in question. Then
>> I can kill the PID with a signal. There is a small race condition in there,
>> where the PID could die & a new process could be born using the original
>> PID. Now this might not be very likely but I was thinking that if it is
>> possible to map from a host PID to a container PID, you can do it more
>> safely, e.g. look up the container PID associated with the host PID, then
>> setns() into the container and kill the container PID. Now although there
>> is still a race condition, you are guaranteed that if the race hits you'll
>> only kill a process within the same container, not the host at large,
>> which is good when the user invoking the API is unprivileged.
>
> I'm afraid I don't know of any way to do that.  At some point a new
> /proc/self/pids or somesuch file was suggested to get that info.

I do wonder how the checkpoint/restart folks are getting that
information.

If you have the appropriate privileges you can use a unix domain socket.

Eric

* Re: Mapping between host & container PIDs ?
From: Eric W. Biederman @ 2012-11-27 21:49 UTC
  To: Daniel P. Berrange; +Cc: Linux Containers

"Daniel P. Berrange" <berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> Thinking about it more generally, this isn't really a container specific
> problem, but rather an issue with the kill() syscall. It is the same
> general class of problem as you see checking file permissions for example,
> which is why you would use fstat() instead of stat() in many cases. It
> might call for a way to get a FD associated with a pid (eg the /proc/$pid
> dir handle) and then be able to kill() via that FD. eg something like
>
>
>   dirfd = open("/proc/$pid", O_RDONLY);
>
>   exefd = openat(dirfd, "exe", O_RDONLY);
>   ...check it is the exe you think it is...
>
>   cgroupfd = openat(dirfd, "cgroup", O_RDONLY);
>   ...check the process is where you expect it to be...
>
>   fkill(dirfd, SIGKILL);
>
> That's probably a whole can of worms though, so I think I'll just
> restrict myself to killing processes based on the container's view
> of the PID for now.

Yes, that is the general solution.  It is very reasonable to have a proc
file that you can write to that will send a signal to its process.

I keep thinking it will be worth implementing one of these days.

Eric

* Re: Mapping between host & container PIDs ?
From: Matt Helsley @ 2012-11-30  0:43 UTC
  To: Eric W. Biederman; +Cc: Linux Containers

On Tue, Nov 27, 2012 at 07:50:35AM -0600, Eric W. Biederman wrote:
> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
> 
> > Quoting Daniel P. Berrange (berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> >> I'm trying to find out if there is a way to map between host and container
> >> PIDs, at minimum in the host -> container direction. My use case is to be
> >> able to kill processes associated with a container, based on the host PID,
> >> in a race free manner.
> >> 
> >> Given a host PID, I can read the 'tasks' file for the container's cgroup
> >> to verify that the PID is associated with the container in question. Then
> >> I can kill the PID with a signal. There is a small race condition in there,
> >> where the PID could die & a new process could be born using the original
> >> PID. Now this might not be very likely but I was thinking that if it is
> >> possible to map from a host PID to a container PID, you can do it more
> >> safely, e.g. look up the container PID associated with the host PID, then
> >> setns() into the container and kill the container PID. Now although there
> >> is still a race condition, you are guaranteed that if the race hits you'll
> >> only kill a process within the same container, not the host at large,
> >> which is good when the user invoking the API is unprivileged.
> >
> > I'm afraid I don't know of any way to do that.  At some point a new
> > /proc/self/pids or somesuch file was suggested to get that info.
> 
> I do wonder how the checkpoint/restart folks are getting that
> information.

Perhaps via the parasite thread? I guess they just inject code that does
getpid(), and, because we know which process they ptrace'd on the host
side, they know the mapping in both pid namespaces.

Cheers,
	-Matt Helsley
