Linux Container Development
 help / color / mirror / Atom feed
From: sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org
To: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: Containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: Re: [RFC][PATCH][cryo] Save/restore state of unnamed pipes
Date: Wed, 18 Jun 2008 14:56:12 -0700	[thread overview]
Message-ID: <20080618215612.GA27603@us.ibm.com> (raw)
In-Reply-To: <1213819055.28482.50.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>

Matt Helsley [matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org] wrote:
| 
| > So let me rephrase the problem.
| > 
| > Suppose the checkpointed application was using fds in following
| > "orig-fd-set"
| > 
| > 	{ [0..10], 18, 27 }
| > 
| > where 18 and 27 are part of a pipe. For simplicity lets assume that
| > 18 is the read-side-fd.
| 
| so orig_pipefd[0] == 18
| and orig_pipefd[1] == 27
| 
| > We checkpointed this application and are now trying to restart it.
| > 
| > In the restarted application, we would call
| > 
| > 	dup2(fd1, fd2),
| > 
| > where 'fd1' is some new, random fd and 'fd2' is an fd in 'orig-fd-set'
|                            ^^^^^^  Even if they were truly random, this
| does not preclude fd1 from having the same value as an fd in the
| remaining orig-fd-set -- such as one of the two we're about to try and
| restart with pipe().

I agree. fd1 could be an hither-to-unseen fd from the 'orig-fd-set'.

| 
| > (say fd2 = 18).
| 
| fd1 = restarted_pipefd[0]
| fd2 = restarted_pipefd[1]
| 
| In my example fd1 == 27 and fd2 == 18
| 
| > IIUC, there is a risk here of 'fd2' being closed accidentally while
| > it is in use.
| 
| 	Yes, that's the risk.
| 
| > But, the only way I can see 'fd2' being in use in the restarted process
| > is if _cryo_ opened some file _during_ restart and did not close. I ran
| 
| 	Both file descriptors returned from pipe() are in use during restart
| and closing one of them would not be proper. Cryo hasn't "forgotten" to
| close one of them -- cryo needs to dup2() both of them to their
| "destination" fds. But if they have been swapped or if just one is the
| "destination" of the other then you could end up with a broken pipe.

Ok I see what you are saying. 

The assumption I have is that we would process the fds from 'orig-fd-set'
in ascending order. Its good to confirm that assumption now :-)

proc_readfd_common() seems to return the fds in ascending order (so
readdir() of "/proc/pid/fd/" would get them in ascending order - no ?)

If we process 'orig-fd-set' in order and suppose we create the pipe for
the smaller of the two fds (could be the write-side). Then the other side
of the pipe would either not collide with an existing fd or that
fd would not be in the 'orig-fd-set' (in the latter case it would
be safe for dup2() to close).

| 
| > into this early on with the randomize_va_space file (which was easily
| > fixed).
| 
| 	This logic only works if cryo only has one new fd at a time. However
| that's not possible with pipe(). Or socketpair(). In those cases one of
| the two new fds could be the "destination" fd for the other. In that
| case dup2() will kindly close it for you and break your new
| pipe/socketpair! :)
| 
| 	That's why I asked if POSIX guarantees the read side file descriptor is
| always less than the write side. If it does then the numbers can't be
| swapped and maybe using your assumption that we don't have any other fds
| accidentally left open ensures dup2() will be safe.

I don't think POSIX guarantees, but will double check.

| 
| > Would cryo need to keep one or more temporary/debug files open in the
| > restarted process (i.e files that are not in the 'orig-fd-set').
| 
| 	There's no need to keep temporary/debug files open that I can see. Just
| a need to be careful when more than one new file descriptor has been
| created before doing a dup2().
| 
| > If cryo does, then maybe it could open such files:
| > 
| > 	- after clone() (so files are not open in restarted process), or
| > 
| > 	- find the last_fd used and dup2() to that fd, leaving the
| > 	  'orig-fd-set' all open/available for restarted process 
| > 
| > For debug, before each 'dup2(fd1, fd2)' we could 'fstat(fd2, &buf)'
| > to ensure 'fd2' is not in use and error out if it is.
| 
| fstat() could certainly be useful for debugging dup2(). However it still
| doesn't nicely show us whether there are any fds we've leaked that we
| forgot about unless we fstat() all possible fds and then compare the set
| of existing fds to the orig-fd-set.

Yes, was suggesting fstat() only to detect collisions, but yes, to
detect leaks, we have to do more.

  parent reply	other threads:[~2008-06-18 21:56 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-17 21:29 [RFC][PATCH][cryo] Save/restore state of unnamed pipes sukadev-r/Jw6+rmf7HQT0dZR+AlfA
     [not found] ` <20080617212950.GB11826-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-06-17 22:30   ` Serge E. Hallyn
     [not found]     ` <20080617223039.GB26942-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-06-17 22:55       ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
2008-06-17 23:31       ` Matt Helsley
     [not found]         ` <1213745472.16057.37.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-06-18  0:32           ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
     [not found]             ` <20080618003214.GA14699-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-06-18  2:04               ` Matt Helsley
     [not found]                 ` <1213754646.16057.107.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-06-18 18:00                   ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
     [not found]                     ` <20080618180025.GA25261-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-06-18 19:57                       ` Matt Helsley
     [not found]                         ` <1213819055.28482.50.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-06-18 21:56                           ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA [this message]
     [not found]                             ` <20080618215612.GA27603-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-06-18 22:38                               ` Matt Helsley
     [not found]                                 ` <1213828689.28482.92.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-06-18 22:55                                   ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
2008-06-19  7:59                   ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
     [not found]                     ` <20080619075953.GA4295-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-06-19 23:46                       ` Matt Helsley
     [not found]                         ` <1213919160.28482.279.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-06-20  1:54                           ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
     [not found]                             ` <20080620015410.GA13775-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-06-20  2:48                               ` Matt Helsley
2008-06-22 21:40   ` Oren Laadan
     [not found]     ` <485EC6E4.4080707-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-06-23 23:30       ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080618215612.GA27603@us.ibm.com \
    --to=sukadev-r/jw6+rmf7hqt0dzr+alfa@public.gmane.org \
    --cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
    --cc=matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox