From: Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
Cyrill Gorcunov
<gorcunov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Linux Containers
<containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>,
Nathan Lynch <ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>,
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
Daniel Lezcano <dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
Subject: Re: [TOOLS] To make use of the patches
Date: Sat, 23 Jul 2011 12:32:04 +0400
Message-ID: <4E2A8704.3030306@parallels.com>
In-Reply-To: <20110722234558.GD16940-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
>> #define PIPEFS_MAGIC 0x50495045
>
> Shouldn't there be only one MAGIC number for checkpoint contents?
>
> You can always add an additional "type" number following the magic
> number. Or make the type a string with the name of the /proc file it's
> from... etc.
I don't quite get the idea here, can you elaborate please?
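One possible reading of the suggestion, sketched under assumptions: every image file starts with the same magic number, followed by a type field that says what the file holds. The names CKPT_MAGIC, enum img_type, struct img_header and write_img_header() are hypothetical and appear neither in the patches nor in the tools.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical: a single magic value shared by every image file. */
#define CKPT_MAGIC 0x43505431u

/* Hypothetical type tags that would replace the per-file magic numbers. */
enum img_type {
    IMG_TYPE_FDINFO = 1,
    IMG_TYPE_PIPES  = 2,
};

struct img_header {
    uint32_t magic; /* always CKPT_MAGIC */
    uint32_t type;  /* one of enum img_type */
};

/* Write the common header to fd. Returns 0 on success, -1 on error. */
static int write_img_header(int fd, enum img_type type)
{
    struct img_header h = { .magic = CKPT_MAGIC, .type = (uint32_t)type };

    return write(fd, &h, sizeof(h)) == (ssize_t)sizeof(h) ? 0 : -1;
}
```

A reader of the image would then check the magic once and switch on the type field.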
>> static void continue_task(int pid)
>> {
>>     if (kill(pid, SIGCONT))
>>         perror("Can't cont task");
>> }
>
> Eventually, I think you should use the cgroup freezer here rather
> than signals. Shells and debuggers use these signals so a checkpoint
> could easily and quietly be corrupted.
Yes, sure! As I said, I will switch to it in the 2nd iteration.
> Even if you use the freezer, there needs to be a mechanism to
> assure that the frozen cgroup is not thawed before a consistent
> checkpoint is complete. Otherwise corruption is always a possibility.
Yes, this is a good point. I'm thinking about it.
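A minimal sketch of the freezer approach, assuming the cgroup (v1) freezer controller, where writing FROZEN or THAWED to a group's freezer.state file changes its state. The helper name freezer_write_state() and the path in the usage note are hypothetical; where the hierarchy is mounted depends on the system.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a state string ("FROZEN" or "THAWED") to a freezer.state file.
 * state_file is the full path to that file (assumption: the caller
 * knows where the freezer hierarchy is mounted).
 * Returns 0 on success, -1 on error.
 */
static int freezer_write_state(const char *state_file, const char *state)
{
    size_t len = strlen(state);
    ssize_t n;
    int fd;

    fd = open(state_file, O_WRONLY);
    if (fd < 0)
        return -1;

    n = write(fd, state, len);
    close(fd);

    return n == (ssize_t)len ? 0 : -1;
}
```

Usage would look like freezer_write_state("/sys/fs/cgroup/freezer/ckpt/freezer.state", "FROZEN") before the dump and the same call with "THAWED" afterwards. Reading the file back can report FREEZING while the transition is in progress, so a real tool would poll until it reads FROZEN. The sketch does nothing to stop somebody else from thawing the group in the middle of a checkpoint.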
>> static int dump_pipe_and_data(int lfd, struct pipes_entry *e)
>> {
>>     int steal_pipe[2];
>>     int ret;
>>
>>     printf("\tDumping data from pipe %x\n", e->pipeid);
>>     if (pipe(steal_pipe) < 0) {
>>         perror("Can't create pipe for stealing data");
>>         return 1;
>>     }
>>
>>     ret = tee(lfd, steal_pipe[1], MAX_PIPE_BUF_SIZE, SPLICE_F_NONBLOCK);
>
> Neat application of tee().
Thanks! :)
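For readers unfamiliar with the trick: tee(2) duplicates data between two pipes without consuming it from the source, so the task's pipe still holds its bytes after the dump. A minimal standalone sketch follows; peek_pipe() is a hypothetical helper, not a function from the tools.

```c
#define _GNU_SOURCE /* for tee() and SPLICE_F_NONBLOCK */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate up to max bytes queued in the pipe src_rd into the pipe
 * dst_wr, leaving the data in src_rd untouched.
 * Returns the number of bytes duplicated (0 if the source is empty),
 * or -1 on error.
 */
static ssize_t peek_pipe(int src_rd, int dst_wr, size_t max)
{
    ssize_t n = tee(src_rd, dst_wr, max, SPLICE_F_NONBLOCK);

    if (n < 0 && errno == EAGAIN)
        return 0; /* nothing queued right now */
    return n;
}
```

The amount duplicated in one call is bounded by the free space in the destination pipe, so a dumper handling large pipe buffers would need to drain the destination and repeat, or enlarge it first.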
>>     if (ret < 0) {
>>         if (errno != EAGAIN) {
>>             perror("Can't pick pipe data");
>>             return 1;
>>         }
>>
>>         ret = 0;
>>     }
>>
>>     e->bytes = ret;
>>     write(pipes_img, e, sizeof(*e));
>>
>>     if (ret) {
>>         ret = splice(steal_pipe[0], NULL, pipes_img, NULL, ret, 0);
>>         if (ret < 0) {
>>             perror("Can't push pipe data");
>>             return 1;
>>         }
>>     }
>>
>>     close(steal_pipe[0]);
>>     close(steal_pipe[1]);
>>     return 0;
>> }
>>
>> static int dump_one_pipe(int fd, int lfd, unsigned int id, unsigned int flags)
>> {
>>     struct pipes_entry e;
>>
>>     printf("\tDumping pipe %d/%x flags %x\n", fd, id, flags);
>>
>>     e.fd = fd;
>>     e.pipeid = id;
>>     e.flags = flags;
>>
>>     if (flags & O_WRONLY) {
>>         e.bytes = 0;
>>         write(pipes_img, &e, sizeof(e));
>>         return 0;
>>     }
>>
>>     return dump_pipe_and_data(lfd, &e);
>> }
>>
>> static int dump_one_fd(int dir, char *fd_name, unsigned long pos, unsigned int flags)
>> {
>>     int fd;
>>     struct stat st_buf;
>>     struct statfs stfs_buf;
>>
>>     printf("\tDumping fd %s\n", fd_name);
>>     fd = openat(dir, fd_name, O_RDONLY);
>>     if (fd == -1) {
>>         printf("Tried to openat %d/%d %s\n", getpid(), dir, fd_name);
>>         perror("Can't open fd");
>>         return 1;
>>     }
>>
>>     if (fstat(fd, &st_buf) < 0) {
>>         perror("Can't stat one");
>>         return 1;
>>     }
>>
>>     if (S_ISREG(st_buf.st_mode))
>>         return dump_one_reg_file(FDINFO_FD, atoi(fd_name), fd, 1, pos, flags);
>>
>>     if (S_ISFIFO(st_buf.st_mode)) {
>>         if (fstatfs(fd, &stfs_buf) < 0) {
>>             perror("Can't statfs one");
>>             return 1;
>>         }
>>
>>         if (stfs_buf.f_type == PIPEFS_MAGIC)
>>             return dump_one_pipe(atoi(fd_name), fd, st_buf.st_ino, flags);
>>     }
>
> This is starting to look like a linear search over the set of all
> possible types of things file descriptors can refer to. A kernel implementation
> doesn't have to do this. Furthermore, if lots of file descriptors are open
> this could be a lot of fstat() and fstatfs() calls -- will making so many
> syscalls force us to a completely in-kernel implementation, like the
> set already proposed, just to get usable performance?
A kernel implementation doesn't have to do any syscalls at all. If we're going to
do it in the kernel, then we should throw this set away and resurrect Oren's set.
As far as the many fstat() calls are concerned - yes, some sort of optimization
is surely required there.
>>
>>     if (!strcmp(fd_name, "0")) {
>>         printf("\tSkipping stdin\n");
>>         return 0;
>>     }
>
> Assuming that fd 0 is "stdin" is very very gross. Yes, it's almost always
> true. But that does *not* mean that it's a pty. stdin could be a pipe
> we need to checkpoint. Really, this is also about the "type" of thing
> the fd is referring to -- not about which fd nr it is.
>
> What are your plans for removing this?
This was done just to make it possible to demonstrate what this code can do:
checkpointing shell scripts and restoring them in (probably) another session.
The plan for this part is to implement c/r support for terminals and throw
this explicit check for stdio away :)
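Until terminals are supported, the check could at least key off the type of the object rather than the descriptor number. A minimal sketch, assuming the dumper holds a local descriptor for the target's file (as dump_one_fd() does after openat()); fd_is_terminal() is a hypothetical helper.

```c
#include <unistd.h>

/* Return 1 if fd refers to a terminal device, 0 otherwise. */
static int fd_is_terminal(int fd)
{
    return isatty(fd) == 1;
}
```

With this in place of the strcmp(fd_name, "0") test, a pipe that happens to sit on descriptor 0 would still be dumped, and a terminal on any other descriptor would be skipped.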
>> static unsigned long rawhex(char *str, char **end)
>> {
>>     unsigned long ret = 0;
>>
>>     while (1) {
>>         if (str[0] >= '0' && str[0] <= '9') {
>>             ret <<= 4;
>>             ret += str[0] - '0';
>>         } else if (str[0] >= 'a' && str[0] <= 'f') {
>>             ret <<= 4;
>>             ret += str[0] - 'a' + 0xA;
>>         } else if (str[0] >= 'A' && str[0] <= 'F') {
>>             ret <<= 4;
>>             ret += str[0] - 'A' + 0xA;
>>         } else {
>>             if (end)
>>                 *end = str;
>>             return ret;
>>         }
>>
>>         str++;
>>     }
>> }
>
> nit: I haven't looked closely enough to see where rawhex is being used,
> but is there no suitable library function for this?
Well, I looked for one but didn't find any. All I've met required a 0x to precede the hex number.
If you point me to one, I will gladly replace mine with it.
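For what it is worth, strtoul() with an explicit base of 16 accepts the 0x prefix but does not require it, and it reports where parsing stopped through its second argument, much like rawhex(). A sketch of a drop-in wrapper; parse_hex() is a hypothetical name, and unlike rawhex() it also skips leading whitespace and accepts a sign.

```c
#include <stdlib.h>

/*
 * Parse a hexadecimal number with or without a leading "0x".
 * On return *end (if end is non-NULL) points at the first character
 * that is not part of the number.
 */
static unsigned long parse_hex(const char *str, char **end)
{
    return strtoul(str, end, 16);
}
```

Parsing an address range such as "7f00a000-7f00b000" this way stops at the '-', so the second number can be parsed starting from end + 1.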
>> static int dump_file_shared_map(char *start, char *mdesc, int lfd)
>> {
>>     printf("\tSkipping file shared mapping at %s\n", start);
>>     close(lfd);
>>     return 0;
>> }
>
> Shouldn't this be an error since it appears these shared mappings
> are currently unsupported?
Why unsupported? Shared file mappings are fully supported, unless some bug found its
way into the source.
>> printf("%d/%d EXEC IMAGE\n", pid, getpid());
>> return execl(path, path, NULL);
>
> How are you going to restore O_CLOEXEC flags?
Don't know yet. But assuming we have agreed on using execve for restoring tasks, the solution
is to just set this flag and call exec. Since my binary handler doesn't call setup_new_exec()
(which closes the files), these bits will be preserved.
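Setting the flag itself is a plain fcntl() call. A minimal sketch of what the restorer could do, just before calling exec, for each descriptor recorded as close-on-exec, assuming (as described above) that the image handler leaves the descriptor table alone; set_cloexec() is a hypothetical helper.

```c
#include <fcntl.h>

/* Set FD_CLOEXEC on fd, keeping any other descriptor flags. */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;

    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

The dump side would have to record the flag per descriptor for this to work; where that information comes from is left open here.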
> For any subsequent postings could you split this up into multiple
> emails -- perhaps one per file?
OK, will do this.
> Or perhaps make them patches to the kernel's tools directory?
Hm... I didn't think about having these tools be part of the kernel source tree.
Maybe it would be better if I publish the tools in a git repo, what do you think?
> Cheers,
> -Matt Helsley
> .
>