From: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: Containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: Re: [BUG][cryo] Create file on restart ?
Date: Wed, 16 Jul 2008 21:21:34 -0500
Message-ID: <20080717022134.GB21726@us.ibm.com>
In-Reply-To: <1216247460.4844.177.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
Quoting Matt Helsley (matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
>
> On Wed, 2008-07-16 at 14:26 -0700, sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org wrote:
> > Serge E. Hallyn [serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org] wrote:
> > | Quoting sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> > | > Serge E. Hallyn [serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org] wrote:
> > | > | Quoting sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> > | > | >
> > | > | > cryo does not (cannot ?) recreate files if the application created
> > | > |
> > | > | I think that's for the best.
> > | > |
> > | > | Don't you?
> > | >
> > | > I can understand that configuration or data files should exist, but
> > | > not sure about temporary or log files that an application created
> > | > upon start-up and expects to be present. Should the admin find
> > | > out about them and create them by hand before restart ?
> > |
> > | I think the admin should have set the destination environment such that
> > | the task is restarted in the same network fs in the same directory, with
> > | no files having been deleted.
>
> [Assuming Serge meant: s/network fs/network, fs,/]
Well, no, I meant a network filesystem, at least if you're migrating apps
around a cluster.
> > or new files created ? For instance if the application was checkpointed
> > before it created a temporary file with O_EXCL flag, that temporary
> > file must not exist when restarting ?
>
> I think that's not a problem given my assumptions above. The filesystem
> that the application restarts in would be the same because the admin
> should have set up the restart environment as Serge suggested. The admin
> can't rely on restart in an alternate environment. However, given
> knowledge of the application and environment, using an alternate
> environment may be a risk the admin is willing to take.
Yup. But Suka is right that in the case of the checkpointed app
continuing to run for a bit before being killed and restarted, it could
get out of whack with respect to the file system.
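
To make the O_EXCL case concrete, here is a minimal sketch in Python. It is not cryo code. The helper names (`create_exclusive`, `simulate`) and the file name `app.tmp` are made up for illustration, and the "checkpoint" and "restart" are only simulated by calling the same create step twice. It shows that if the task runs on past the checkpoint and creates its temporary file, the restarted task's exclusive create then fails.

```python
import os
import tempfile


def create_exclusive(path):
    """Create path with O_CREAT | O_EXCL.

    Returns True if the file was created, False if it already existed.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def simulate(workdir):
    """Simulate a checkpoint taken just before an exclusive create."""
    path = os.path.join(workdir, "app.tmp")  # hypothetical temp file name

    # Checkpoint is (conceptually) taken here: the file does not exist yet.

    # The original task keeps running for a bit and creates the file.
    first = create_exclusive(path)

    # The task is killed and restarted from the checkpoint, so it
    # repeats the same create against the now-modified filesystem.
    second = create_exclusive(path)
    return first, second


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(simulate(d))
```

The second create fails with EEXIST, so the restarted task would take an error path that the original run never saw, unless the filesystem is rolled back to its state at checkpoint time.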
> > | Am I wrong?
> >
> > So we take a snapshot of the FS and checkpoint the application. Do they
> > need to be atomic ?
>
> If all the applications in a container are frozen then I think we can
> get fs snapshots consistent with checkpointed applications.
> Otherwise, yes, I think we'd be gambling that the checkpointed
> application isn't interacting with another, running, application via an
> intermittently-shared file.
What fun :)
I wonder whether the experience of users of c/r on SGI and Cray could
teach us anything here.
-serge