From: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
To: Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Linus Torvalds <torvalds-3NddpPZAyC0@public.gmane.org>,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>,
"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC v10][PATCH 09/13] Restore open file descriprtors
Date: Mon, 01 Dec 2008 15:41:53 -0500
Message-ID: <49344C11.6090204@cs.columbia.edu>
In-Reply-To: <1228159324.2971.74.camel@nimitz>
Dave Hansen wrote:
> On Fri, 2008-11-28 at 11:27 +0000, Al Viro wrote:
>> On Wed, Nov 26, 2008 at 08:04:40PM -0500, Oren Laadan wrote:
>>> +/**
>>> + * cr_attach_get_file - attach (and get) lonely file ptr to a file descriptor
>>> + * @file: lonely file pointer
>>> + */
>>> +static int cr_attach_get_file(struct file *file)
>>> +{
>>> +	int fd = get_unused_fd_flags(0);
>>> +
>>> +	if (fd >= 0) {
>>> +		fsnotify_open(file->f_path.dentry);
>>> +		fd_install(fd, file);
>>> +		get_file(file);
>>> +	}
>>> +	return fd;
>>> +}
>> What happens if another thread closes the descriptor in question between
>> fd_install() and get_file()?
>
> You're just saying to flip the get_file() and fd_install()?
Indeed.
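
To make the ordering concrete, here is a minimal userspace sketch of the
corrected sequence. It is not kernel code: the model_* names are hypothetical
stand-ins for get_file(), fd_install() and friends, the descriptor table is a
plain array, and the fsnotify_open() call is left out. It only illustrates
that the extra reference has to be taken before the file is published in the
table, so that a close() by another thread cannot drop the last reference.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins only; names mirror the kernel helpers but are hypothetical. */
struct model_file {
	int refcount;	/* number of live references */
	int freed;	/* set once the last reference is dropped */
};

#define MODEL_MAX_FDS 8
static struct model_file *model_fdt[MODEL_MAX_FDS];

static void model_get_file(struct model_file *f)
{
	assert(!f->freed);	/* taking a reference on a freed file is the bug */
	f->refcount++;
}

static void model_fput(struct model_file *f)
{
	if (--f->refcount == 0)
		f->freed = 1;
}

static int model_get_unused_fd(void)
{
	for (int fd = 0; fd < MODEL_MAX_FDS; fd++)
		if (model_fdt[fd] == NULL)
			return fd;
	return -1;
}

/* Installing hands one reference over to the descriptor table. */
static void model_fd_install(int fd, struct model_file *f)
{
	model_fdt[fd] = f;
}

/* What a concurrent close() would do: drop the table's reference. */
static void model_close(int fd)
{
	struct model_file *f = model_fdt[fd];

	model_fdt[fd] = NULL;
	if (f)
		model_fput(f);
}

/* Corrected order: pin the file first, then publish it. */
static int model_attach_get_file(struct model_file *file)
{
	int fd = model_get_unused_fd();

	if (fd >= 0) {
		model_get_file(file);		/* caller's extra reference */
		model_fd_install(fd, file);	/* file becomes visible to others */
	}
	return fd;
}
```

With the calls in this order, a close() that lands right after the install
only drops the table's reference; the caller's reference keeps the file alive.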
>
>>> +	fd = cr_attach_file(file);	/* no need to cleanup 'file' below */
>>> +	if (fd < 0) {
>>> +		filp_close(file, NULL);
>>> +		ret = fd;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* register new <objref, file> tuple in hash table */
>>> +	ret = cr_obj_add_ref(ctx, file, parent, CR_OBJ_FILE, 0);
>>> +	if (ret < 0)
>>> +		goto out;
>> Who said that file still exists at that point?
Correct. This call should move higher up, before the call to cr_attach_file().
>
> Ahhh. We're depending on the 'struct file' reference that comes from
> the fd table. That's why there is supposedly "no need to cleanup 'file'
> below". But, some other thread can come along and close() the fd, which
> will __fput() our poor 'struct file' and will make it go away. Next
> time we go and pull it out of the hash table, we go boom.
>
> As a quick fix, I think we can just take another get_file() here. But,
> as Al notes, there are some much larger issues that we face with the
> fd_table and multi-thread access. They haven't "mattered" to us so far
> because we assume everything is either single-threaded or frozen.
> Sounds like Al isn't comfortable with this being integrated until a much
> more detailed look has been taken.
>
>> BTW, there are shitloads of races here - references to fd and struct file *
>> are mixed in a way that breaks *badly* if descriptor table is played with
>> by another thread.
The assumption about tasks being frozen and no additional sharing is generally
more strict, more likely to hold, and easier to enforce for the restart.
Besides the race pointed out above, which would crash the kernel, the other races
are "ok" - if the user abuses the interface, then the results are "undefined"
(refer to my reply to "[RFC v10][PATCH 08/13] Dump open file descriptors").
Here, too, by "undefined" I mean that the restart syscall may fail, and if it
completes successfully the resulting set of tasks is not guaranteed to behave
correctly. In contrast, if the user uses the interface correctly (ensuring
that the assumption holds), then restart is guaranteed to succeed. Note that
even when the outcome is undefined, there are no security issues - all actions
are limited to what the initiating user can do.
> One of the things about this that bothers me is that it shares too
> little with existing VFS code. It calls into a ton of existing stuff
> but doesn't refactor anything that is currently there. Surely there are
> some common bits somewhere in the VFS that could be consolidated here.
Actually, the code alternates between "file" and "fd", in an attempt to reuse
existing code and not do things ourselves:
	ret = sys_fcntl(fd, F_SETFL, hh->f_flags & CR_SETFL_MASK);
	if (ret < 0)
		goto out;
	ret = vfs_llseek(file, hh->f_pos, SEEK_SET);
	if (ret == -ESPIPE)	/* ignore error on non-seekable files */
		ret = 0;
This is still safe: the file struct is protected with a reference count. If
the fd no longer points to the same struct file, then either it will fail
(e.g. if the fd is invalid) or the restart will eventually succeed but the
resulting state of the tasks will be incorrect (that is: undefined behavior).
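
The three outcomes described above can be illustrated with a small userspace
sketch. Again this is not kernel code: the sim_* names, the array used as a
descriptor table, and the plain fields standing in for f_flags and f_pos are
all assumptions made for the example. The fd-based step looks the descriptor
up at call time, while the pointer-based step acts on the object that the
restart code holds a reference to.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for 'struct file'; the holder of a reference keeps it valid. */
struct sim_file {
	int refcount;
	long pos;
	unsigned flags;
};

#define SIM_MAX_FDS 4
static struct sim_file *sim_fdt[SIM_MAX_FDS];

/* fd-based step: resolves the descriptor when called (in the spirit of sys_fcntl). */
static int sim_setfl(int fd, unsigned flags)
{
	if (fd < 0 || fd >= SIM_MAX_FDS || sim_fdt[fd] == NULL)
		return -EBADF;
	sim_fdt[fd]->flags = flags;
	return 0;
}

/* pointer-based step: acts on the pinned object (in the spirit of vfs_llseek). */
static long sim_llseek(struct sim_file *f, long pos)
{
	f->pos = pos;
	return pos;
}

/* Mixed sequence, as in the excerpt: flags via the fd, position via the pointer. */
static int sim_restore(int fd, struct sim_file *file, unsigned flags, long pos)
{
	int ret = sim_setfl(fd, flags);

	if (ret < 0)
		return ret;	/* fd no longer valid: the restore fails cleanly */
	sim_llseek(file, pos);
	return 0;
}
```

If the descriptor was closed in between, sim_restore() fails with -EBADF; if
it was re-pointed at a different file, the call succeeds but the flags land on
the wrong object. In neither case is a freed object touched, because the
pointer-based step only uses the file the caller still references.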
Oren.