From: Daniel Lezcano <daniel.lezcano@free.fr>
To: Oren Laadan <orenl@cs.columbia.edu>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
linux-fsdevel@vger.kernel.org,
containers@lists.linux-foundation.org,
Jamie Lokier <jamie@shareable.org>,
Andreas Dilger <adilger@sun.com>
Subject: Re: [C/R v20][PATCH 38/96] c/r: dump open file descriptors
Date: Mon, 22 Mar 2010 09:40:32 +0100
Message-ID: <4BA72D00.7040406@free.fr>
In-Reply-To: <4BA6914D.8040007@cs.columbia.edu>
Oren Laadan wrote:
>
>
> Daniel Lezcano wrote:
>> Serge E. Hallyn wrote:
>>> Quoting Jamie Lokier (jamie@shareable.org):
>>>
>>>> Matt Helsley wrote:
>>>>
>>>>>> That said, if the intent is to allow the restore to be done on
>>>>>> another node with a "similar" filesystem (e.g. created by rsync/node
>>>>>> image), instead of having a coherent distributed filesystem on all
>>>>>> of the nodes then the filename makes sense.
>>>>>>
>>>>> Yes, this is the intent.
>>>>>
>>>> I would worry about programs which are using files which have been
>>>> deleted, renamed, or (very common) renamed-over by another process
>>>> after being opened, as there's a good chance they will successfully
>>>> open the wrong file after c/r, and corrupt state from then on.
>>>>
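A minimal sketch of the hazard Jamie describes, assuming a POSIX system
and using only Python's os module. The helper name path_still_matches()
is made up for this illustration and is not part of the patch set. It
compares the device and inode of an open descriptor with whatever the
saved filename resolves to now:

```python
import os
import tempfile


def path_still_matches(fd: int, path: str) -> bool:
    """Return True if `path` still names the file that `fd` has open."""
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        return False  # the name was deleted or renamed away
    opened = os.fstat(fd)
    return (opened.st_dev, opened.st_ino) == (on_disk.st_dev, on_disk.st_ino)


def demo() -> dict:
    """Walk through the untouched, renamed-over and deleted cases."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data")
        with open(path, "w") as f:
            f.write("original")
        fd = os.open(path, os.O_RDONLY)
        try:
            results["untouched"] = path_still_matches(fd, path)

            # Another writer renames a new file over the old name.
            tmp = os.path.join(d, "data.new")
            with open(tmp, "w") as f:
                f.write("replacement")
            os.rename(tmp, path)
            results["renamed_over"] = path_still_matches(fd, path)

            # The name disappears entirely.
            os.unlink(path)
            results["deleted"] = path_still_matches(fd, path)

            # Link count of the file that fd still holds open.
            results["nlink"] = os.fstat(fd).st_nlink
        finally:
            os.close(fd)
    return results
```

In the renamed-over case, reopening by filename succeeds but yields a
different file, which is exactly the silent corruption being discussed.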
>>> Userspace is expected to back up and restore the filesystem, for
>>> instance using a btrfs snapshot or a simple rsync or tar.
>>>
>>>
>> That does not solve the problem Jamie is talking about.
>> An rsync or a tar will not see a deleted file, and requiring btrfs just
>> to make C/R work with deleted files is a bit overkill, no?
>
> Let's separate the issues of file system snapshot and deleted files.
>
> 1) File system snapshot:
> ------------------------
> The requirement is to preserve the file system state between the time
> of the checkpoint and the time of the restart, because userspace will
> expect it to remain the same.
>
> The alternatives are:
>
> a) Use a capable file system, like btrfs, or (modified) nilfs.
>
> b) Userspace saves the state e.g. w/ tar or rsync (maybe incremental)
>
> c) Assume/expect that the file system isn't modified between checkpoint
> and restart (e.g. if we use c/r to suspend a user's session)
>
> d) Expect userspace to adapt to changes if they occur, e.g. by having
> the application be aware of the possibility, or by providing a wrapper
> that will do some magic prior to restart (by looking at the checkpoint
> image).
>
> Options a, b and c are all transparent to the application, while option
> d requires that applications become aware of c/r. That's ok, but our
> primary goal is to be generic enough to support unmodified applications.
>
> 2) Deleted files:
> -----------------
> The requirement is that at restart we'll be able to restore the file
> pointer in the kernel to a deleted file with the same properties and
> contents as it had at the time of the checkpoint.
>
> The alternatives we considered are:
>
> e) For each deleted file, save the contents of that file as part of
> the checkpoint image;
> At restart - create a new file, populate with the contents, open it
> (to get an active file pointer), and finally unlink it, so it is -
> again - deleted.
>
> f) At checkpoint time, create a file (from scratch) in a dedicated
> area of the file system (userspace configurable?), and copy the
> contents of the deleted file to this file. Only save the file system
> state after this is done.
> At restart, open the alternative file instead, and then immediately
> delete it.
>
> g) At checkpoint time, re-link the file to a dedicated area of the
> file system. This requires support from the underlying file system,
> of course. For instance, it's trivial for ext2,3 but IIRC will need
> help for ext4. Re-linking is essentially attaching a new filename
> to an existing inode that is still referenced but is otherwise not
> reachable - and make it reachable again.
> At restart, open the re-linked file and then immediately delete it.
>
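A rough userspace sketch of option (e), assuming a POSIX system. The
function names and the dict used as a "checkpoint image" are invented
for this illustration; the real implementation lives in the kernel and
would also have to preserve open flags, ownership and mode, which this
sketch ignores:

```python
import os
import tempfile


def read_all(fd: int) -> bytes:
    """Read the whole file with pread, leaving the file offset untouched."""
    chunks, pos = [], 0
    while True:
        chunk = os.pread(fd, 1 << 16, pos)
        if not chunk:
            break
        chunks.append(chunk)
        pos += len(chunk)
    return b"".join(chunks)


def write_all(fd: int, data: bytes) -> None:
    """Write every byte, coping with short writes."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def checkpoint_deleted_file(fd: int) -> dict:
    """Save contents and current offset of an open, unlinked file."""
    return {"data": read_all(fd), "offset": os.lseek(fd, 0, os.SEEK_CUR)}


def restore_deleted_file(image: dict, scratch_dir: str) -> int:
    """Create a new file, populate it, then unlink it so it is deleted again."""
    fd, path = tempfile.mkstemp(dir=scratch_dir)
    write_all(fd, image["data"])
    os.lseek(fd, image["offset"], os.SEEK_SET)
    os.unlink(path)  # the descriptor stays valid; only the name goes away
    return fd
```

Options (f) and (g) follow the same open-then-unlink pattern at restart;
they differ only in where the contents live between checkpoint and
restart.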
>> I have another question about deleted files. How is the case handled
>> when a process has a deleted file mapped but no associated file
>> descriptor?
>>
>
> It works the same as with non-deleted files (assuming that we know
> how to handle deleted files in general, e.g. options e,f,g above):
>
> To checkpoint a task's mm we loop through the vma's and checkpoint
> them. For a vma that corresponds to a mapped file, we first save
> the vma->vm_file. In turn, for a file pointer we save the filename,
> properties, credentials. A file pointer is saved as an independent
> object - and is assigned a unique id - objref. The state of the vma
> will indicate this objref.
>
> At restart, we will first see the file pointer object, and will
> open the file to create a corresponding file pointer. Later when
> we restore the vma, we'll locate the (new) file pointer using the
> objref and use it in mmap.
>
> Oren.
>
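The objref scheme described above can be sketched in userspace roughly
as follows. This is only an illustration under stated assumptions: the
record layout ("file"/"vma" tuples) is invented, and the filename stands
in for the identity of the kernel's file pointer. Each file is saved
once and given an objref; every vma record carries that objref, and at
restart the mapping is rebuilt from the reopened file:

```python
import mmap
import os
import tempfile


def checkpoint(vmas):
    """vmas: list of (file_path, length). Return the image as a record list."""
    image = []
    objrefs = {}  # file identity -> objref (path is a stand-in here)
    for path, length in vmas:
        if path not in objrefs:
            objrefs[path] = len(objrefs) + 1
            # The file object is saved once, before any vma that uses it.
            image.append(("file", objrefs[path], path))
        image.append(("vma", objrefs[path], length))
    return image


def restart(image):
    """Reopen each file once, then rebuild every mapping via its objref."""
    files = {}  # objref -> newly opened descriptor
    mappings = []
    for kind, objref, payload in image:
        if kind == "file":
            files[objref] = os.open(payload, os.O_RDONLY)
        else:
            mappings.append(
                mmap.mmap(files[objref], payload, access=mmap.ACCESS_READ)
            )
    # The mappings remain valid after the descriptors are closed, which
    # mirrors the "mapped file without a file descriptor" case.
    for fd in files.values():
        os.close(fd)
    return mappings
```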
Thanks Oren for the detailed answer.