From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Serge E. Hallyn"
Subject: Re: c/r of pdeath
Date: Fri, 19 Jun 2009 17:35:25 -0500
Message-ID: <20090619223525.GA401@us.ibm.com>
References: <20090619182114.GA27320@us.ibm.com> <4A3C1084.3080305@cs.columbia.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <4A3C1084.3080305-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: Oren Laadan
Cc: Linux Containers
List-Id: containers.vger.kernel.org

Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org):
>
>
> Serge E. Hallyn wrote:
> > Hi Oren,
> >
> > commit 9a45e26c0aabda6a94e2ac620befd8ee12a7363d adds
> > reset of pdeath_signal. It does so unconditionally. I
> > don't think that's safe. Perhaps if pdeath_signal is
> > anything other than 0, it should only be restored if
> > the task is capable(CAP_KILL)?
>
> Hmmm... maybe I'm missing something here, but --

Nope, you're not. I was thinking wrong.

> pdeath_signal indicates that the process wishes to receive
> a signal, not to send one. It may change through prctl()
> without requiring any capabilities from the caller. Finally
> it is reset at fork/clone.
>
> So at worst it will kill the specific task that holds it?
>
> --
>
> As a side note - for a brief moment I worried that it may
> break restart with zombies: if the to-be-zombie process has
> a child that already restarted (including pdeath_signal) and
> then exits, then the child will receive a signal unwillingly.
>
> I then realized that it's safe as long as we restore parents
> before their children. In turn this depends on the checkpoint
> order, which indeed operates this way.
>
> Otherwise we would have needed to set this on all processes
> after all zombies have indeed terminated - which means another
> sync point at restart, or a sweep by the coordinator over all tasks.
>
> Oren.
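
The prctl() behaviour described above can be checked from userspace.
What follows is only a minimal sketch, assuming Linux and glibc; the
helper names (get_pdeathsig, set_pdeathsig, child_pdeathsig_after_fork)
are made up for illustration and are not part of the checkpoint/restart
code. It sets pdeath_signal without any capability, reads it back, and
reports the value a freshly forked child starts with.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the calling task's current pdeath_signal, or -1 on error. */
int get_pdeathsig(void)
{
    int sig = -1;

    if (prctl(PR_GET_PDEATHSIG, &sig) != 0)
        return -1;
    return sig;
}

/*
 * Ask to receive `sig` when the parent dies; 0 clears the request.
 * The task only changes its own setting, so no capability is needed.
 */
int set_pdeathsig(int sig)
{
    return prctl(PR_SET_PDEATHSIG, (unsigned long)sig);
}

/*
 * Fork and report the pdeath_signal the child starts with.
 * Returns that value, or -1 on error. The child passes the value back
 * through its exit status; 255 is reserved here to mean "prctl failed".
 */
int child_pdeathsig_after_fork(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int sig = get_pdeathsig();
        _exit(sig < 0 ? 255 : sig);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 255 ? -1 : WEXITSTATUS(status);
}
```

Calling set_pdeathsig(SIGTERM) followed by child_pdeathsig_after_fork()
from an unprivileged main() should show the parent keeping SIGTERM while
the child starts at 0, which is the fork-time reset Oren mentions.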