From mboxrd@z Thu Jan 1 00:00:00 1970 From: Matt Helsley Subject: Re: [C/R v20][PATCH 37/96] c/r: introduce new 'file_operations': ->checkpoint, ->collect() Date: Mon, 22 Mar 2010 03:16:35 -0700 Message-ID: <20100322101635.GC20796@count0.beaverton.ibm.com> References: <1268960401-16680-1-git-send-email-orenl@cs.columbia.edu> <1268960401-16680-3-git-send-email-orenl@cs.columbia.edu> <20100322063428.GE17637@laptop> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Oren Laadan , linux-fsdevel@vger.kernel.org, containers@lists.linux-foundation.org, Matt Helsley , Andreas Dilger To: Nick Piggin Return-path: Received: from e32.co.us.ibm.com ([32.97.110.150]:52307 "EHLO e32.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752867Ab0CVKQx (ORCPT ); Mon, 22 Mar 2010 06:16:53 -0400 Received: from d03relay05.boulder.ibm.com (d03relay05.boulder.ibm.com [9.17.195.107]) by e32.co.us.ibm.com (8.14.3/8.13.1) with ESMTP id o2MAAHww030398 for ; Mon, 22 Mar 2010 04:10:17 -0600 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d03relay05.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o2MAGaJJ128144 for ; Mon, 22 Mar 2010 04:16:38 -0600 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.3/8.13.1/NCO v10.0 AVout) with ESMTP id o2M3GZTT027911 for ; Sun, 21 Mar 2010 21:16:36 -0600 Content-Disposition: inline In-Reply-To: <20100322063428.GE17637@laptop> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Mon, Mar 22, 2010 at 05:34:28PM +1100, Nick Piggin wrote: > On Thu, Mar 18, 2010 at 08:59:46PM -0400, Oren Laadan wrote: > > While we assume all normal files and directories can be checkpointed, > > there are, as usual in the VFS, specialized places that will always > > need an ability to override these defaults. Although we could do this > > completely in the checkpoint code, that would bitrot quickly. > > > > This adds a new 'file_operations' function for checkpointing a file. > > It is assumed that there should be a dirt-simple way to make something > > (un)checkpointable that fits in with current code. > > > > As you can see in the ext[234] patches down the road, all that we have > > to do to make something simple be supported is add a single "generic" > > f_op entry. > > > > Also adds a new 'file_operations' function for 'collecting' a file for > > leak-detection during full-container checkpoint. This is useful for > > those files that hold references to other "collectable" objects. Two > > examples are pty files that point to corresponding tty objects, and > > eventpoll files that refer to the files they are monitoring. > > > > Finally, this patch introduces vfs_fcntl() so that it can be called > > from restart (see patch adding restart of files). > > > > Changelog[v17] > > - Introduce 'collect' method > > Changelog[v17] > > - Forward-declare 'ckpt_ctx' et-al, don't use checkpoint_types.h > > > > Signed-off-by: Oren Laadan > > Acked-by: Serge E. Hallyn > > Tested-by: Serge E. Hallyn > > --- > > fs/fcntl.c | 21 +++++++++++++-------- > > include/linux/fs.h | 7 +++++++ > > 2 files changed, 20 insertions(+), 8 deletions(-) > > > > diff --git a/fs/fcntl.c b/fs/fcntl.c > > index 97e01dc..e1f02ca 100644 > > --- a/fs/fcntl.c > > +++ b/fs/fcntl.c > > @@ -418,6 +418,18 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg, > > return err; > > } > > > > +int vfs_fcntl(int fd, unsigned int cmd, unsigned long arg, struct file *filp) > > +{ > > + int err; > > + > > + err = security_file_fcntl(filp, cmd, arg); > > + if (err) > > + goto out; > > + err = do_fcntl(fd, cmd, arg, filp); > > + out: > > + return err; > > +} > > + > > SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd, unsigned long, arg) > > { > > struct file *filp; > > @@ -427,14 +439,7 @@ SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd, unsigned long, arg) > > if (!filp) > > goto out; > > > > - err = security_file_fcntl(filp, cmd, arg); > > - if (err) { > > - fput(filp); > > - return err; > > - } > > - > > - err = do_fcntl(fd, cmd, arg, filp); > > - > > + err = vfs_fcntl(fd, cmd, arg, filp); > > fput(filp); > > out: > > return err; > > There is no point combining these two logically distinct patches. Good point. > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > index 6c08df2..65ebec5 100644 > > --- a/include/linux/fs.h > > +++ b/include/linux/fs.h > > @@ -394,6 +394,7 @@ struct kstatfs; > > struct vm_area_struct; > > struct vfsmount; > > struct cred; > > +struct ckpt_ctx; > > > > extern void __init inode_init(void); > > extern void __init inode_init_early(void); > > @@ -1093,6 +1094,8 @@ struct file_lock { > > > > #include > > > > +extern int vfs_fcntl(int fd, unsigned cmd, unsigned long arg, struct file *fp); > > + > > extern void send_sigio(struct fown_struct *fown, int fd, int band); > > > > #ifdef CONFIG_FILE_LOCKING > > @@ -1504,6 +1507,8 @@ struct file_operations { > > ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int); > > ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int); > > int (*setlease)(struct file *, long, struct file_lock **); > > + int (*checkpoint)(struct ckpt_ctx *, struct file *); > > + int (*collect)(struct ckpt_ctx *, struct file *); > > }; > > > > struct inode_operations { > > You didn't add any documentation for this (unless it is in a following > patch, which it shouldn't be). Another good point -- we should have added that to Documentation/filesystems/vfs.txt > > > @@ -2313,6 +2318,8 @@ void inode_sub_bytes(struct inode *inode, loff_t bytes); > > loff_t inode_get_bytes(struct inode *inode); > > void inode_set_bytes(struct inode *inode, loff_t bytes); > > > > +#define generic_file_checkpoint NULL > > + > > extern int vfs_readdir(struct file *, filldir_t, void *); > > > > extern int vfs_stat(char __user *, struct kstat *); > > Hmm, what does generic_file_checkpoint mean? A NULL checkpoint op means > that checkpointing is allowed, and no action is required? Shouldn't it > be an opt-in operation, where NULL means not allowed? generic_file_checkpoint is for files that have a seek operation and can be backed up or restored with a simple copy. A NULL checkpoint op means "not allowed" as you thought it should. What gave you the impression it was otherwise? Here's the relevant snippet from checkpoint/files.c: /* checkpoint callback for file pointer */ int checkpoint_file(struct ckpt_ctx *ctx, void *ptr) { struct file *file = (struct file *) ptr; int ret; if (!file->f_op || !file->f_op->checkpoint) { ckpt_err(ctx, -EBADF, "%(T)%(P)%(V)f_op lacks checkpoint\n", file, file->f_op); return -EBADF; } > Either way, I don't know if you need to have this #define, provided you > have sufficient documentation. We need it (or a suitable replacement) to avoid adding #ifdef around assignments to the operation in every filesystem. It's used if CONFIG_CHECKPOINT is not defined. Thanks for the review. Cheers, -Matt Helsley