linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin-l3A5Bk7waGM@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Andreas Dilger <adilger-xsfywfwIY+M@public.gmane.org>
Subject: Re: [C/R v20][PATCH 38/96] c/r: dump open file descriptors
Date: Mon, 22 Mar 2010 21:30:35 +1100	[thread overview]
Message-ID: <20100322103035.GF17637@laptop> (raw)
In-Reply-To: <1268960401-16680-4-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>

On Thu, Mar 18, 2010 at 08:59:47PM -0400, Oren Laadan wrote:
> @@ -531,6 +533,15 @@ static int init_checkpoint_ctx(struct ckpt_ctx *ctx, pid_t pid)
>  		return -EINVAL;  /* cleanup by ckpt_ctx_free() */
>  	}
>  
> +	/* root vfs (FIX: WILL CHANGE with mnt-ns etc */
> +	task_lock(ctx->root_task);
> +	fs = ctx->root_task->fs;
> +	read_lock(&fs->lock);
> +	ctx->root_fs_path = fs->root;
> +	path_get(&ctx->root_fs_path);
> +	read_unlock(&fs->lock);
> +	task_unlock(ctx->root_task);
> +
>  	return 0;
>  }
>  
> diff --git a/checkpoint/files.c b/checkpoint/files.c
> new file mode 100644
> index 0000000..7a57b24
> --- /dev/null
> +++ b/checkpoint/files.c
> +char *ckpt_fill_fname(struct path *path, struct path *root, char *buf, int *len)
> +{
> +	struct path tmp = *root;
> +	char *fname;
> +
> +	BUG_ON(!buf);
> +	spin_lock(&dcache_lock);
> +	fname = __d_path(path, &tmp, buf, *len);
> +	spin_unlock(&dcache_lock);
> +	if (IS_ERR(fname))
> +		return fname;
> +	*len = (buf + (*len) - fname);
> +	/*
> +	 * FIX: if __d_path() changed these, it must have stepped out of
> +	 * init's namespace. Since currently we require a unified namespace
> +	 * within the container: simply fail.
> +	 */
> +	if (tmp.mnt != root->mnt || tmp.dentry != root->dentry)
> +		fname = ERR_PTR(-EBADF);

Maybe something like this is better in fs/?


> +static int scan_fds(struct files_struct *files, int **fdtable)
> +{
> +	struct fdtable *fdt;
> +	int *fds = NULL;
> +	int i = 0, n = 0;
> +	int tot = CKPT_DEFAULT_FDTABLE;
> +
> +	/*
> +	 * We assume that all tasks possibly sharing the file table are
> +	 * frozen (or we are a single process and we checkpoint ourselves).
> +	 * Therefore, we can safely proceed after krealloc() from where we
> +	 * left off. Otherwise the file table may be modified by another
> +	 * task after we scan it. The behavior is this case is undefined,
> +	 * and either checkpoint or restart will likely fail.
> +	 */
> + retry:
> +	fds = krealloc(fds, tot * sizeof(*fds), GFP_KERNEL);
> +	if (!fds)
> +		return -ENOMEM;
> +
> +	rcu_read_lock();
> +	fdt = files_fdtable(files);
> +	for (/**/; i < fdt->max_fds; i++) {
> +		if (!fcheck_files(files, i))
> +			continue;
> +		if (n == tot) {
> +			rcu_read_unlock();
> +			tot *= 2;	/* won't overflow: kmalloc will fail */
> +			goto retry;
> +		}
> +		fds[n++] = i;
> +	}
> +	rcu_read_unlock();

...

> +static int checkpoint_file_desc(struct ckpt_ctx *ctx,
> +				struct files_struct *files, int fd)
> +{
> +	struct ckpt_hdr_file_desc *h;
> +	struct file *file = NULL;
> +	struct fdtable *fdt;
> +	int objref, ret;
> +	int coe = 0;	/* avoid gcc warning */
> +	pid_t pid;
> +
> +	h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_FILE_DESC);
> +	if (!h)
> +		return -ENOMEM;
> +
> +	rcu_read_lock();
> +	fdt = files_fdtable(files);
> +	file = fcheck_files(files, fd);
> +	if (file) {
> +		coe = FD_ISSET(fd, fdt->close_on_exec);
> +		get_file(file);
> +	}
> +	rcu_read_unlock();
> +
> +	ret = find_locks_with_owner(file, files);
> +	/*
> +	 * find_locks_with_owner() returns an error when there
> +	 * are no locks found, so we *want* it to return an error
> +	 * code.  Its success means we have to fail the checkpoint.
> +	 */
> +	if (!ret) {
> +		ret = -EBADF;
> +		ckpt_err(ctx, ret, "%(T)fd %d has file lock or lease\n", fd);
> +		goto out;
> +	}
> +
> +	/* sanity check (although this shouldn't happen) */
> +	ret = -EBADF;
> +	if (!file) {
> +		ckpt_err(ctx, ret, "%(T)fd %d gone?\n", fd);
> +		goto out;
> +	}
> +
> +	/*
> +	 * TODO: Implement c/r of fowner and f_sigio.  Should be
> +	 * trivial, but for now we just refuse its checkpoint
> +	 */
> +	pid = f_getown(file);
> +	if (pid) {
> +		ret = -EBUSY;
> +		ckpt_err(ctx, ret, "%(T)fd %d has an owner (%d)\n", fd);
> +		goto out;
> +	}
> +
> +	/*
> +	 * if seen first time, this will add 'file' to the objhash, keep
> +	 * a reference to it, dump its state while at it.
> +	 */

All these kinds of things (including above hunks) IMO are nasty to put
outside fs/. It would be nice to see higher level functionality
implemented in fs and exported to your checkpoint stuff.

Apparently it's hard because checkpointing is so incestuous with
everything, but that's why it's important to structure the code well.

  parent reply	other threads:[~2010-03-22 10:30 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-19  0:59 [C/R v20][PATCH 00/96] Linux Checkpoint-Restart - v20 Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 20/96] c/r: make file_pos_read/write() public Oren Laadan
2010-03-22  6:31   ` Nick Piggin
2010-03-23  0:12     ` Oren Laadan
2010-03-23  0:43       ` Nick Piggin
2010-03-23  0:56         ` Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 37/96] c/r: introduce new 'file_operations': ->checkpoint, ->collect() Oren Laadan
2010-03-22  6:34   ` Nick Piggin
2010-03-22 10:16     ` Matt Helsley
2010-03-22 11:00       ` Nick Piggin
2010-03-19  0:59 ` [C/R v20][PATCH 38/96] c/r: dump open file descriptors Oren Laadan
2010-03-19 23:19   ` Andreas Dilger
2010-03-20  4:43     ` Matt Helsley
2010-03-21 17:27       ` Jamie Lokier
2010-03-21 19:40         ` Serge E. Hallyn
2010-03-21 20:58           ` Daniel Lezcano
2010-03-21 21:36             ` Oren Laadan
     [not found]               ` <4BA6914D.8040007-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2010-03-21 23:31                 ` xing lin
2010-03-22  8:40               ` Daniel Lezcano
2010-03-22  2:12             ` Matt Helsley
2010-03-22 13:51               ` Jamie Lokier
2010-03-22 23:18               ` Andreas Dilger
2010-03-22  1:06         ` Matt Helsley
2010-03-22  2:20           ` Jamie Lokier
2010-03-22  3:37             ` Matt Helsley
2010-03-22 14:13               ` Jamie Lokier
2010-03-22  2:55           ` Serge E. Hallyn
     [not found]   ` <1268960401-16680-4-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2010-03-22 10:30     ` Nick Piggin [this message]
2010-03-22 13:22       ` Matt Helsley
2010-03-22 13:38         ` Nick Piggin
2010-03-19  0:59 ` [C/R v20][PATCH 39/96] c/r: restore " Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 40/96] c/r: introduce method '->checkpoint()' in struct vm_operations_struct Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 44/96] c/r: add generic '->checkpoint' f_op to ext fses Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 45/96] c/r: add generic '->checkpoint()' f_op to simple devices Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 46/96] c/r: add checkpoint operation for opened files of generic filesystems Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 50/96] splice: export pipe/file-to-pipe/file functionality Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 51/96] c/r: support for open pipes Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 52/96] c/r: checkpoint and restore FIFOs Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 53/96] c/r: refuse to checkpoint if monitoring directories with dnotify Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 66/96] c/r: restore file->f_cred Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 82/96] c/r: checkpoint/restart epoll sets Oren Laadan
2010-03-19  0:59 ` [C/R v20][PATCH 83/96] c/r: checkpoint/restart eventfd Oren Laadan
2010-03-19  1:00 ` [C/R v20][PATCH 84/96] c/r: restore task fs_root and pwd (v3) Oren Laadan
2010-03-19  1:00 ` [C/R v20][PATCH 85/96] c/r: preliminary support mounts namespace Oren Laadan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100322103035.GF17637@laptop \
    --to=npiggin-l3a5bk7wagm@public.gmane.org \
    --cc=adilger-xsfywfwIY+M@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).