* [BUG][cryo] Create file on restart ?
@ 2008-07-16 18:50 sukadev-r/Jw6+rmf7HQT0dZR+AlfA
[not found] ` <20080716185027.GA1335-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: sukadev-r/Jw6+rmf7HQT0dZR+AlfA @ 2008-07-16 18:50 UTC (permalink / raw)
To: Containers
cryo does not (cannot ?) recreate files if the application created
a file before checkpoint and the file does not exist at the time
of restart.
Note that the 'flags' field in '/proc/$pid/fdinfo/$fd' will not
have the O_CREAT (or O_TRUNC, O_EXCL, O_NOCTTY) flags. These
are cleared in __dentry_open().
At the time of restart, is there a way for cryo to know that the
file must be created ?
To reproduce:
- run following program,
- checkpoint after the first printf
- rm /tmp/foo1
- restart # fails to open file during restart
---
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	int fd;
	int i;
	const char *buf = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

	/* Create (or truncate) the file; take the checkpoint after the printf below. */
	fd = open("/tmp/foo1", O_RDWR|O_CREAT|O_TRUNC, 0666);
	if (fd < 0) {
		perror("open");
		exit(1);
	}
	printf("%d: Opened '/tmp/foo1', fd %d\n", (int)getpid(), fd);

	/* Write one byte every two seconds so the process is still running at checkpoint. */
	for (i = 0; i < (int)strlen(buf); i++) {
		if (write(fd, &buf[i], 1) < 0) {
			printf("Error %d writing %c to file, i %d\n",
				errno, buf[i], i);
			exit(1);
		}
		printf("%d: i %d, wrote %c\n", (int)getpid(), i, buf[i]);
		sleep(2);
	}
	return 0;
}
^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20080716185027.GA1335-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716185027.GA1335-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2008-07-16 19:26 ` Serge E. Hallyn
[not found] ` <20080716192604.GA27454-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Serge E. Hallyn @ 2008-07-16 19:26 UTC (permalink / raw)
To: sukadev-r/Jw6+rmf7HQT0dZR+AlfA; +Cc: Containers
Quoting sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
>
> cryo does not (cannot ?) recreate files if the application created
I think that's for the best.
Don't you?
The most we should do is make sure cryo has a clear enough error
message.
-serge
> a file before checkpoint and the file does not exist at the time
> of restart.
>
> Note that the 'flags' field in '/proc/$pid/fdinfo/$fd' will not
> have the O_CREAT (or O_TRUNC, O_EXCL, O_NOCTTY) flags. These
> are cleared in __dentry_open()).
>
> At the time of restart, is there a way for cryo to know that the
> file must be created ?
[parent not found: <20080716192604.GA27454-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716192604.GA27454-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2008-07-16 20:45 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
[not found] ` <20080716204529.GA4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 20:59 ` Matt Helsley
1 sibling, 1 reply; 12+ messages in thread
From: sukadev-r/Jw6+rmf7HQT0dZR+AlfA @ 2008-07-16 20:45 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Containers
Serge E. Hallyn [serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org] wrote:
| Quoting sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
| >
| > cryo does not (cannot ?) recreate files if the application created
|
| I think that's for the best.
|
| Don't you?
I can understand that configuration or data files should exist, but
not sure about temporary or log files that an application created
upon start-up and expects to be present. Should the admin find
out about them and create them by hand before restart ?
[parent not found: <20080716204529.GA4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716204529.GA4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2008-07-16 20:57 ` Serge E. Hallyn
[not found] ` <20080716205737.GA2082-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Serge E. Hallyn @ 2008-07-16 20:57 UTC (permalink / raw)
To: sukadev-r/Jw6+rmf7HQT0dZR+AlfA; +Cc: Containers
Quoting sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> I can understand that configuration or data files should exist, but
> not sure about temporary or log files that an application created
> upon start-up and expects to be present. Should the admin find
> out about them and create them by hand before restart ?
I think the admin should have set the destination environment such that
the task is restarted in the same network fs in the same directory, with
no files having been deleted.
Am I wrong?
-serge
[parent not found: <20080716205737.GA2082-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716205737.GA2082-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2008-07-16 21:26 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
[not found] ` <20080716212609.GB4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: sukadev-r/Jw6+rmf7HQT0dZR+AlfA @ 2008-07-16 21:26 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Containers
Serge E. Hallyn [serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org] wrote:
| I think the admin should have set the destination environment such that
| the task is restarted in the same network fs in the same directory, with
| no files having been deleted.
or new files created ? For instance if the application was checkpointed
before it created a temporary file with O_EXCL flag, that temporary
file must not exist when restarting ?
|
| Am I wrong?
So we take a snapshot of the FS and checkpoint the application. Do they
need to be atomic ?
Either way, I withdraw the bug :-)
[parent not found: <20080716212609.GB4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716212609.GB4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2008-07-16 22:31 ` Matt Helsley
[not found] ` <1216247460.4844.177.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-07-17 2:18 ` Serge E. Hallyn
2008-07-17 23:22 ` Oren Laadan
2 siblings, 1 reply; 12+ messages in thread
From: Matt Helsley @ 2008-07-16 22:31 UTC (permalink / raw)
To: sukadev-r/Jw6+rmf7HQT0dZR+AlfA; +Cc: Containers
On Wed, 2008-07-16 at 14:26 -0700, sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org wrote:
> Serge E. Hallyn [serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org] wrote:
> | I think the admin should have set the destination environment such that
> | the task is restarted in the same network fs in the same directory, with
> | no files having been deleted.
[Assuming Serge meant: s/network fs/network, fs,/]
> or new files created ? For instance if the application was checkpointed
> before it created a temporary file with O_EXCL flag, that temporary
> file must not exist when restarting ?
I think that's not a problem given my assumptions above. The filesystem
that the application restarts in would be the same because the admin
should have set up the restart environment as Serge suggested. The admin
can't rely on restart in an alternate environment. However, given
knowledge of the application and environment, using an alternate
environment may be a risk the admin is willing to take.
> |
> | Am I wrong?
>
> So we take a snapshot of the FS and checkpoint the application. Do they
> need to be atomic ?
If all the applications in a container are frozen then I think we can
get fs snapshots consistent with checkpointed applications.
Otherwise, yes, I think we'd be gambling that the checkpointed
application isn't interacting with another, running, application via an
intermittently-shared file.
Cheers,
-Matt
[parent not found: <1216247460.4844.177.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <1216247460.4844.177.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2008-07-16 23:20 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
2008-07-17 2:21 ` Serge E. Hallyn
1 sibling, 0 replies; 12+ messages in thread
From: sukadev-r/Jw6+rmf7HQT0dZR+AlfA @ 2008-07-16 23:20 UTC (permalink / raw)
To: Matt Helsley; +Cc: Containers
| If all the applications in a container are frozen then I think we can
| get fs snapshots consistent with checkpointed applications.
I agree in general, but cryo currently takes a checkpoint and the
application resumes, which means the application could create the temp
file. So, cryo should not resume the application until the FS snapshot
is taken too I guess.
| Otherwise, yes, I think we'd be gambling that the checkpointed
| application isn't interacting with another, running, application via an
| intermittently-shared file.
Yes.
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <1216247460.4844.177.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-07-16 23:20 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
@ 2008-07-17 2:21 ` Serge E. Hallyn
[not found] ` <20080717022134.GB21726-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
1 sibling, 1 reply; 12+ messages in thread
From: Serge E. Hallyn @ 2008-07-17 2:21 UTC (permalink / raw)
To: Matt Helsley; +Cc: Containers
Quoting Matt Helsley (matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> > | I think the admin should have set the destination environment such that
> > | the task is restarted in the same network fs in the same directory, with
> > | no files having been deleted.
>
> [Assuming Serge meant: s/network fs/network, fs,/]
Well no I meant a network filesystem - at least if you're migrating apps
around a cluster.
> > or new files created ? For instance if the application was checkpointed
> > before it created a temporary file with O_EXCL flag, that temporary
> > file must not exist when restarting ?
>
> I think that's not a problem given my assumptions above. The filesystem
> that the application restarts in would be the same because the admin
> should have set up the restart environment as Serge suggested. The admin
> can't rely on restart in an alternate environment. However, given
> knowledge of the application and environment, using an alternate
> environment may be a risk the admin is willing to take.
Yup. But Suka is right that in the case of the checkpointed app
continuing to run for a bit before being killed and restarted, it could
get out of whack with respect to the file system.
> > So we take a snapshot of the FS and checkpoint the application. Do they
> > need to be atomic ?
>
> If all the applications in a container are frozen then I think we can
> get fs snapshots consistent with checkpointed applications.
> Otherwise, yes, I think we'd be gambling that the checkpointed
> application isn't interacting with another, running, application via an
> intermittently-shared file.
What fun :)
I wonder whether the experience of users of c/r on sgi and cray could
teach us anything here.
-serge
[parent not found: <20080717022134.GB21726-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080717022134.GB21726-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2008-07-17 23:35 ` Oren Laadan
0 siblings, 0 replies; 12+ messages in thread
From: Oren Laadan @ 2008-07-17 23:35 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Containers
Serge E. Hallyn wrote:
> Yup. But Suka is right that in the case of the checkpointed app
> continuing to run for a bit before being killed and restarted, it could
> get out of whack with respect to the file system.
>
> What fun :)
>
> I wonder whether the experience of users of c/r on sgi and cray could
> teach us anything here.
If you are checkpointing to migrate the application, you need not worry
about the file system, as it may not change while you migrate.
If you are checkpointing to be able to recover from an error later, you
need to snapshot the file system, but you may get away with it in some
cases.
If you are checkpointing to be able to travel back in time (return to
older than last checkpoint), you certainly need to snapshot the file
system.
In any event, I think this is something we may want to discuss in the
mini-summit.
Oren.
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716212609.GB4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 22:31 ` Matt Helsley
@ 2008-07-17 2:18 ` Serge E. Hallyn
2008-07-17 23:22 ` Oren Laadan
2 siblings, 0 replies; 12+ messages in thread
From: Serge E. Hallyn @ 2008-07-17 2:18 UTC (permalink / raw)
To: sukadev-r/Jw6+rmf7HQT0dZR+AlfA; +Cc: Containers
Quoting sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> or new files created ? For instance if the application was checkpointed
> before it created a temporary file with O_EXCL flag, that temporary
> file must not exist when restarting ?
>
> So we take a snapshot of the FS and checkpoint the application. Do they
> need to be atomic ?
>
> Either way, I withdraw the bug :-)
Well it's certainly beyond the scope of cryo. I'd prefer if we didn't
have to snapshot the fs at each checkpoint (!) and I think in many or
most cases (think one long-running scientific app or seti@home that just
occasionally gets migrated or stopped for a reboot) it won't be an issue.
But your point seems valid.
-serge
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716212609.GB4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 22:31 ` Matt Helsley
2008-07-17 2:18 ` Serge E. Hallyn
@ 2008-07-17 23:22 ` Oren Laadan
2 siblings, 0 replies; 12+ messages in thread
From: Oren Laadan @ 2008-07-17 23:22 UTC (permalink / raw)
To: sukadev-r/Jw6+rmf7HQT0dZR+AlfA; +Cc: Containers
sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org wrote:
> or new files created ? For instance if the application was checkpointed
> before it created a temporary file with O_EXCL flag, that temporary
> file must not exist when restarting ?
>
> So we take a snapshot of the FS and checkpoint the application. Do they
> need to be atomic ?
Yes they do, in the sense that the FS must be snapshotted when the
container is quiescent to ensure consistency.
Oren.
* Re: [BUG][cryo] Create file on restart ?
[not found] ` <20080716192604.GA27454-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 20:45 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
@ 2008-07-16 20:59 ` Matt Helsley
1 sibling, 0 replies; 12+ messages in thread
From: Matt Helsley @ 2008-07-16 20:59 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Containers
On Wed, 2008-07-16 at 14:26 -0500, Serge E. Hallyn wrote:
> Quoting sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> >
> > cryo does not (cannot ?) recreate files if the application created
>
> I think that's for the best.
>
> Don't you?
I agree. I think drawing the line for process checkpoint/restart before
preserving the contents of mounted filesystems is very reasonable since
mounted filesystem(s) can already be preserved with your choice of
tool(s).
I think it also gives us more options as far as using checkpointed
images for error recovery; if we take the concept of checkpoint too far
we may limit ourselves to merely reproducing errors rather than also
giving ourselves a means to recover from errors.
Cheers,
-Matt Helsley
> -serge
>
> > a file before checkpoint and the file does not exist at the time
> > of restart.
> >
> > Note that the 'flags' field in '/proc/$pid/fdinfo/$fd' will not
> > have the O_CREAT (or O_TRUNC, O_EXCL, O_NOCTTY) flags. These
> > are cleared in __dentry_open()).
> >
> > At the time of restart, is there a way for cryo to know that the
> > file must be created ?
Thread overview: 12+ messages
2008-07-16 18:50 [BUG][cryo] Create file on restart ? sukadev-r/Jw6+rmf7HQT0dZR+AlfA
[not found] ` <20080716185027.GA1335-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 19:26 ` Serge E. Hallyn
[not found] ` <20080716192604.GA27454-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 20:45 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
[not found] ` <20080716204529.GA4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 20:57 ` Serge E. Hallyn
[not found] ` <20080716205737.GA2082-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 21:26 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
[not found] ` <20080716212609.GB4278-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-16 22:31 ` Matt Helsley
[not found] ` <1216247460.4844.177.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-07-16 23:20 ` sukadev-r/Jw6+rmf7HQT0dZR+AlfA
2008-07-17 2:21 ` Serge E. Hallyn
[not found] ` <20080717022134.GB21726-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-07-17 23:35 ` Oren Laadan
2008-07-17 2:18 ` Serge E. Hallyn
2008-07-17 23:22 ` Oren Laadan
2008-07-16 20:59 ` Matt Helsley