linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] FUSE: Notifying the kernel of deletion.
@ 2011-12-06 18:50 John Muir
  2011-12-06 20:03 ` J. Bruce Fields
  0 siblings, 1 reply; 4+ messages in thread
From: John Muir @ 2011-12-06 18:50 UTC (permalink / raw)
  To: fuse-devel, Miklos Szeredi, Linux-Fsdevel

Allows a FUSE file-system to tell the kernel when a file or directory is deleted. If the specified dentry has the specified inode number, the kernel will unhash it.

Signed-off-by: John Muir <john-eXjPKP/gKhgAvxtiuMwx3w@public.gmane.org>
---
Please find below a patch that add notification of deletion to the FUSE kernel interface. These patches allow the file-system to tell the kernel when a file (and more particularly) a directory is deleted. This is needed because using the current 'notify_inval_entry' does not cause the kernel to clean up directories that are in use properly, and as a result the users of those directories see incorrect semantics from the file-system. The error condition seen when 'notify_inval_entry' is used to notify of a deleted directory is avoided when 'notify_delete' is used instead.

I'll demonstrate below with the following scenario:
 1. User A chdirs into 'testdir' and starts reading 'testfile'.
 2. User B rm -rf 'testdir'.
 3. User B creates 'testdir'.
 4. User C chdirs into 'testdir'.

If you run the above within the same machine on any file-system (including fuse file-systems), there is no problem: user C is able to chdir into the new testdir. The old testdir is removed from the dentry tree, but still open by user A.

If, on the other hand, the operations 2 and 3 are performed via the network such that the fuse file-system uses one of the notify functions to tell the kernel that the nodes are gone, then the following error occurs for user C while user A holds the original directory open:

 muirj@empacher:~> ls /test/testdir
 ls: cannot access /test/testdir: No such file or directory

The issue here is that the kernel still has a dentry for testdir, and so it is requesting the attributes for the old directory, while my file-system is responding that the directory no longer exists.

If on the other hand, if the file-system can notify the kernel that the directory is deleted using the new 'notify_delete' function, then the above ls will find the new directory as expected.


diff -updr orig/fs/fuse/dev.c new/fs/fuse/dev.c
--- orig/fs/fuse/dev.c	2011-12-01 23:56:01.000000000 +0100
+++ new/fs/fuse/dev.c	2011-12-06 19:30:50.682000123 +0100
@@ -1378,7 +1378,59 @@ static int fuse_notify_inval_entry(struc
 	down_read(&fc->killsb);
 	err = -ENOENT;
 	if (fc->sb)
-		err = fuse_reverse_inval_entry(fc->sb, outarg.parent, &name);
+		err = fuse_reverse_inval_entry(fc->sb, outarg.parent, 0, &name);
+	up_read(&fc->killsb);
+	kfree(buf);
+	return err;
+
+err:
+	kfree(buf);
+	fuse_copy_finish(cs);
+	return err;
+}
+
+static int fuse_notify_delete(struct fuse_conn *fc, unsigned int size,
+			      struct fuse_copy_state *cs)
+{
+	struct fuse_notify_delete_out outarg;
+	int err = -ENOMEM;
+	char *buf;
+	struct qstr name;
+
+	buf = kzalloc(FUSE_NAME_MAX + 1, GFP_KERNEL);
+	if (!buf)
+		goto err;
+
+	err = -EINVAL;
+	if (size < sizeof(outarg))
+		goto err;
+
+	err = fuse_copy_one(cs, &outarg, sizeof(outarg));
+	if (err)
+		goto err;
+
+	err = -ENAMETOOLONG;
+	if (outarg.namelen > FUSE_NAME_MAX)
+		goto err;
+
+	err = -EINVAL;
+	if (size != sizeof(outarg) + outarg.namelen + 1)
+		goto err;
+
+	name.name = buf;
+	name.len = outarg.namelen;
+	err = fuse_copy_one(cs, buf, outarg.namelen + 1);
+	if (err)
+		goto err;
+	fuse_copy_finish(cs);
+	buf[outarg.namelen] = 0;
+	name.hash = full_name_hash(name.name, name.len);
+
+	down_read(&fc->killsb);
+	err = -ENOENT;
+	if (fc->sb)
+		err = fuse_reverse_inval_entry(fc->sb, outarg.parent,
+                			       outarg.child, &name);
 	up_read(&fc->killsb);
 	kfree(buf);
 	return err;
@@ -1596,6 +1648,9 @@ static int fuse_notify(struct fuse_conn
 	case FUSE_NOTIFY_RETRIEVE:
 		return fuse_notify_retrieve(fc, size, cs);
 
+	case FUSE_NOTIFY_DELETE:
+		return fuse_notify_delete(fc, size, cs);
+
 	default:
 		fuse_copy_finish(cs);
 		return -EINVAL;
diff -updr orig/fs/fuse/dir.c new/fs/fuse/dir.c
--- orig/fs/fuse/dir.c	2011-12-01 23:56:01.000000000 +0100
+++ new/fs/fuse/dir.c	2011-12-06 19:30:50.683000127 +0100
@@ -868,7 +868,7 @@ int fuse_update_attributes(struct inode
 }
 
 int fuse_reverse_inval_entry(struct super_block *sb, u64 parent_nodeid,
-			     struct qstr *name)
+			     u64 child_nodeid, struct qstr *name)
 {
 	int err = -ENOTDIR;
 	struct inode *parent;
@@ -895,8 +895,36 @@ int fuse_reverse_inval_entry(struct supe
 
 	fuse_invalidate_attr(parent);
 	fuse_invalidate_entry(entry);
+
+	if (child_nodeid != 0 && entry->d_inode) {
+		mutex_lock(&entry->d_inode->i_mutex);
+		if (get_node_id(entry->d_inode) != child_nodeid) {
+			err = -ENOENT;
+			goto badentry;
+		}
+		if (d_mountpoint(entry)) {
+			err = -EBUSY;
+			goto badentry;
+		}
+		if (S_ISDIR(entry->d_inode->i_mode)) {
+			shrink_dcache_parent(entry); 
+			if (!simple_empty(entry)) {
+				err = -ENOTEMPTY;
+				goto badentry;
+			}
+			entry->d_inode->i_flags |= S_DEAD;
+		}
+		dont_mount(entry);
+		clear_nlink(entry->d_inode);
+		err = 0;
+ badentry:
+ 		mutex_unlock(&entry->d_inode->i_mutex);
+		if (!err)
+			d_delete(entry);
+	} else {
+		err = 0;
+	}
 	dput(entry);
-	err = 0;
 
  unlock:
 	mutex_unlock(&parent->i_mutex);
diff -updr orig/fs/fuse/fuse_i.h new/fs/fuse/fuse_i.h
--- orig/fs/fuse/fuse_i.h	2011-12-01 23:56:01.000000000 +0100
+++ new/fs/fuse/fuse_i.h	2011-12-06 19:30:50.691000142 +0100
@@ -755,9 +755,15 @@ int fuse_reverse_inval_inode(struct supe
 /**
  * File-system tells the kernel to invalidate parent attributes and
  * the dentry matching parent/name.
+ *
+ * If the child_nodeid is non-zero and: 
+ *    - matches the inode number for the dentry matching parent/name,
+ *    - is not a mount point
+ *    - is a file or oan empty directory
+ * then the dentry is unhashed (d_delete()).
  */
 int fuse_reverse_inval_entry(struct super_block *sb, u64 parent_nodeid,
-			     struct qstr *name);
+			     u64 child_nodeid, struct qstr *name);
 
 int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
 		 bool isdir);
diff -updr orig/include/linux/fuse.h new/include/linux/fuse.h
--- orig/include/linux/fuse.h	2011-12-01 23:56:01.000000000 +0100
+++ new/include/linux/fuse.h	2011-12-06 19:31:05.686000162 +0100
@@ -283,6 +283,7 @@ enum fuse_notify_code {
 	FUSE_NOTIFY_INVAL_ENTRY = 3,
 	FUSE_NOTIFY_STORE = 4,
 	FUSE_NOTIFY_RETRIEVE = 5,
+	FUSE_NOTIFY_DELETE = 6,
 	FUSE_NOTIFY_CODE_MAX,
 };
 
@@ -605,6 +606,13 @@ struct fuse_notify_inval_entry_out {
 	__u32	namelen;
 	__u32	padding;
 };
+
+struct fuse_notify_delete_out {
+	__u64	parent;
+	__u64	child;
+	__u32	namelen;
+	__u32	padding;
+};
 
 struct fuse_notify_store_out {
 	__u64	nodeid;


------------------------------------------------------------------------------
Cloud Services Checklist: Pricing and Packaging Optimization
This white paper is intended to serve as a reference, checklist and point of 
discussion for anyone considering optimizing the pricing and packaging model 
of a cloud services business. Read Now!
http://www.accelacomm.com/jaw/sfnl/114/51491232/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] FUSE: Notifying the kernel of deletion.
  2011-12-06 18:50 [PATCH] FUSE: Notifying the kernel of deletion John Muir
@ 2011-12-06 20:03 ` J. Bruce Fields
  2011-12-06 20:08   ` John Muir
  0 siblings, 1 reply; 4+ messages in thread
From: J. Bruce Fields @ 2011-12-06 20:03 UTC (permalink / raw)
  To: John Muir; +Cc: fuse-devel, Miklos Szeredi, Linux-Fsdevel

On Tue, Dec 06, 2011 at 07:50:33PM +0100, John Muir wrote:
> Allows a FUSE file-system to tell the kernel when a file or directory is deleted. If the specified dentry has the specified inode number, the kernel will unhash it.
> 
> Signed-off-by: John Muir <john@jmuir.com>
> ---
> Please find below a patch that add notification of deletion to the FUSE kernel interface. These patches allow the file-system to tell the kernel when a file (and more particularly) a directory is deleted. This is needed because using the current 'notify_inval_entry' does not cause the kernel to clean up directories that are in use properly, and as a result the users of those directories see incorrect semantics from the file-system. The error condition seen when 'notify_inval_entry' is used to notify of a deleted directory is avoided when 'notify_delete' is used instead.

You've put all this extra text after the ---, meaning it would be
discarded before going into git.  But this sort of "why we're doing
this" information is exactly what you want in a changelog.

--b.

> 
> I'll demonstrate below with the following scenario:
>  1. User A chdirs into 'testdir' and starts reading 'testfile'.
>  2. User B rm -rf 'testdir'.
>  3. User B creates 'testdir'.
>  4. User C chdirs into 'testdir'.
> 
> If you run the above within the same machine on any file-system (including fuse file-systems), there is no problem: user C is able to chdir into the new testdir. The old testdir is removed from the dentry tree, but still open by user A.
> 
> If, on the other hand, the operations 2 and 3 are performed via the network such that the fuse file-system uses one of the notify functions to tell the kernel that the nodes are gone, then the following error occurs for user C while user A holds the original directory open:
> 
>  muirj@empacher:~> ls /test/testdir
>  ls: cannot access /test/testdir: No such file or directory
> 
> The issue here is that the kernel still has a dentry for testdir, and so it is requesting the attributes for the old directory, while my file-system is responding that the directory no longer exists.
> 
> If on the other hand, if the file-system can notify the kernel that the directory is deleted using the new 'notify_delete' function, then the above ls will find the new directory as expected.
> 
> 
> diff -updr orig/fs/fuse/dev.c new/fs/fuse/dev.c
> --- orig/fs/fuse/dev.c	2011-12-01 23:56:01.000000000 +0100
> +++ new/fs/fuse/dev.c	2011-12-06 19:30:50.682000123 +0100
> @@ -1378,7 +1378,59 @@ static int fuse_notify_inval_entry(struc
>  	down_read(&fc->killsb);
>  	err = -ENOENT;
>  	if (fc->sb)
> -		err = fuse_reverse_inval_entry(fc->sb, outarg.parent, &name);
> +		err = fuse_reverse_inval_entry(fc->sb, outarg.parent, 0, &name);
> +	up_read(&fc->killsb);
> +	kfree(buf);
> +	return err;
> +
> +err:
> +	kfree(buf);
> +	fuse_copy_finish(cs);
> +	return err;
> +}
> +
> +static int fuse_notify_delete(struct fuse_conn *fc, unsigned int size,
> +			      struct fuse_copy_state *cs)
> +{
> +	struct fuse_notify_delete_out outarg;
> +	int err = -ENOMEM;
> +	char *buf;
> +	struct qstr name;
> +
> +	buf = kzalloc(FUSE_NAME_MAX + 1, GFP_KERNEL);
> +	if (!buf)
> +		goto err;
> +
> +	err = -EINVAL;
> +	if (size < sizeof(outarg))
> +		goto err;
> +
> +	err = fuse_copy_one(cs, &outarg, sizeof(outarg));
> +	if (err)
> +		goto err;
> +
> +	err = -ENAMETOOLONG;
> +	if (outarg.namelen > FUSE_NAME_MAX)
> +		goto err;
> +
> +	err = -EINVAL;
> +	if (size != sizeof(outarg) + outarg.namelen + 1)
> +		goto err;
> +
> +	name.name = buf;
> +	name.len = outarg.namelen;
> +	err = fuse_copy_one(cs, buf, outarg.namelen + 1);
> +	if (err)
> +		goto err;
> +	fuse_copy_finish(cs);
> +	buf[outarg.namelen] = 0;
> +	name.hash = full_name_hash(name.name, name.len);
> +
> +	down_read(&fc->killsb);
> +	err = -ENOENT;
> +	if (fc->sb)
> +		err = fuse_reverse_inval_entry(fc->sb, outarg.parent,
> +                			       outarg.child, &name);
>  	up_read(&fc->killsb);
>  	kfree(buf);
>  	return err;
> @@ -1596,6 +1648,9 @@ static int fuse_notify(struct fuse_conn
>  	case FUSE_NOTIFY_RETRIEVE:
>  		return fuse_notify_retrieve(fc, size, cs);
>  
> +	case FUSE_NOTIFY_DELETE:
> +		return fuse_notify_delete(fc, size, cs);
> +
>  	default:
>  		fuse_copy_finish(cs);
>  		return -EINVAL;
> diff -updr orig/fs/fuse/dir.c new/fs/fuse/dir.c
> --- orig/fs/fuse/dir.c	2011-12-01 23:56:01.000000000 +0100
> +++ new/fs/fuse/dir.c	2011-12-06 19:30:50.683000127 +0100
> @@ -868,7 +868,7 @@ int fuse_update_attributes(struct inode
>  }
>  
>  int fuse_reverse_inval_entry(struct super_block *sb, u64 parent_nodeid,
> -			     struct qstr *name)
> +			     u64 child_nodeid, struct qstr *name)
>  {
>  	int err = -ENOTDIR;
>  	struct inode *parent;
> @@ -895,8 +895,36 @@ int fuse_reverse_inval_entry(struct supe
>  
>  	fuse_invalidate_attr(parent);
>  	fuse_invalidate_entry(entry);
> +
> +	if (child_nodeid != 0 && entry->d_inode) {
> +		mutex_lock(&entry->d_inode->i_mutex);
> +		if (get_node_id(entry->d_inode) != child_nodeid) {
> +			err = -ENOENT;
> +			goto badentry;
> +		}
> +		if (d_mountpoint(entry)) {
> +			err = -EBUSY;
> +			goto badentry;
> +		}
> +		if (S_ISDIR(entry->d_inode->i_mode)) {
> +			shrink_dcache_parent(entry); 
> +			if (!simple_empty(entry)) {
> +				err = -ENOTEMPTY;
> +				goto badentry;
> +			}
> +			entry->d_inode->i_flags |= S_DEAD;
> +		}
> +		dont_mount(entry);
> +		clear_nlink(entry->d_inode);
> +		err = 0;
> + badentry:
> + 		mutex_unlock(&entry->d_inode->i_mutex);
> +		if (!err)
> +			d_delete(entry);
> +	} else {
> +		err = 0;
> +	}
>  	dput(entry);
> -	err = 0;
>  
>   unlock:
>  	mutex_unlock(&parent->i_mutex);
> diff -updr orig/fs/fuse/fuse_i.h new/fs/fuse/fuse_i.h
> --- orig/fs/fuse/fuse_i.h	2011-12-01 23:56:01.000000000 +0100
> +++ new/fs/fuse/fuse_i.h	2011-12-06 19:30:50.691000142 +0100
> @@ -755,9 +755,15 @@ int fuse_reverse_inval_inode(struct supe
>  /**
>   * File-system tells the kernel to invalidate parent attributes and
>   * the dentry matching parent/name.
> + *
> + * If the child_nodeid is non-zero and: 
> + *    - matches the inode number for the dentry matching parent/name,
> + *    - is not a mount point
> + *    - is a file or oan empty directory
> + * then the dentry is unhashed (d_delete()).
>   */
>  int fuse_reverse_inval_entry(struct super_block *sb, u64 parent_nodeid,
> -			     struct qstr *name);
> +			     u64 child_nodeid, struct qstr *name);
>  
>  int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
>  		 bool isdir);
> diff -updr orig/include/linux/fuse.h new/include/linux/fuse.h
> --- orig/include/linux/fuse.h	2011-12-01 23:56:01.000000000 +0100
> +++ new/include/linux/fuse.h	2011-12-06 19:31:05.686000162 +0100
> @@ -283,6 +283,7 @@ enum fuse_notify_code {
>  	FUSE_NOTIFY_INVAL_ENTRY = 3,
>  	FUSE_NOTIFY_STORE = 4,
>  	FUSE_NOTIFY_RETRIEVE = 5,
> +	FUSE_NOTIFY_DELETE = 6,
>  	FUSE_NOTIFY_CODE_MAX,
>  };
>  
> @@ -605,6 +606,13 @@ struct fuse_notify_inval_entry_out {
>  	__u32	namelen;
>  	__u32	padding;
>  };
> +
> +struct fuse_notify_delete_out {
> +	__u64	parent;
> +	__u64	child;
> +	__u32	namelen;
> +	__u32	padding;
> +};
>  
>  struct fuse_notify_store_out {
>  	__u64	nodeid;
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] FUSE: Notifying the kernel of deletion.
  2011-12-06 20:03 ` J. Bruce Fields
@ 2011-12-06 20:08   ` John Muir
  2011-12-06 20:42     ` J. Bruce Fields
  0 siblings, 1 reply; 4+ messages in thread
From: John Muir @ 2011-12-06 20:08 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: fuse-devel, Miklos Szeredi, Linux-Fsdevel

On 2011.12.06, at 21:03 , J. Bruce Fields wrote:

> On Tue, Dec 06, 2011 at 07:50:33PM +0100, John Muir wrote:
>> Allows a FUSE file-system to tell the kernel when a file or directory is deleted. If the specified dentry has the specified inode number, the kernel will unhash it.
>> 
>> Signed-off-by: John Muir <john@jmuir.com>
>> ---
>> Please find below a patch that add notification of deletion to the FUSE kernel interface. These patches allow the file-system to tell the kernel when a file (and more particularly) a directory is deleted. This is needed because using the current 'notify_inval_entry' does not cause the kernel to clean up directories that are in use properly, and as a result the users of those directories see incorrect semantics from the file-system. The error condition seen when 'notify_inval_entry' is used to notify of a deleted directory is avoided when 'notify_delete' is used instead.
> 
> You've put all this extra text after the ---, meaning it would be
> discarded before going into git.  But this sort of "why we're doing
> this" information is exactly what you want in a changelog.
> 
> --b.

I was aware of this. My intention was that the first two lines would be in the change log, given that my explanation was rather long. If it isn't too long, and the full explanation can go into the change log, I can definitely re-submit.

Cheers,

John.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] FUSE: Notifying the kernel of deletion.
  2011-12-06 20:08   ` John Muir
@ 2011-12-06 20:42     ` J. Bruce Fields
  0 siblings, 0 replies; 4+ messages in thread
From: J. Bruce Fields @ 2011-12-06 20:42 UTC (permalink / raw)
  To: John Muir; +Cc: fuse-devel, Miklos Szeredi, Linux-Fsdevel

On Tue, Dec 06, 2011 at 09:08:48PM +0100, John Muir wrote:
> On 2011.12.06, at 21:03 , J. Bruce Fields wrote:
> 
> > On Tue, Dec 06, 2011 at 07:50:33PM +0100, John Muir wrote:
> >> Allows a FUSE file-system to tell the kernel when a file or directory is deleted. If the specified dentry has the specified inode number, the kernel will unhash it.
> >> 
> >> Signed-off-by: John Muir <john@jmuir.com>
> >> ---
> >> Please find below a patch that add notification of deletion to the FUSE kernel interface. These patches allow the file-system to tell the kernel when a file (and more particularly) a directory is deleted. This is needed because using the current 'notify_inval_entry' does not cause the kernel to clean up directories that are in use properly, and as a result the users of those directories see incorrect semantics from the file-system. The error condition seen when 'notify_inval_entry' is used to notify of a deleted directory is avoided when 'notify_delete' is used instead.
> > 
> > You've put all this extra text after the ---, meaning it would be
> > discarded before going into git.  But this sort of "why we're doing
> > this" information is exactly what you want in a changelog.
> > 
> > --b.
> 
> I was aware of this. My intention was that the first two lines would be in the change log, given that my explanation was rather long. If it isn't too long, and the full explanation can go into the change log, I can definitely re-submit.

For me, sure, changelog bytes are cheap, and a summary of the patch is
nice but never as useful as "here's why we're doing it".

--b.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-12-06 20:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-06 18:50 [PATCH] FUSE: Notifying the kernel of deletion John Muir
2011-12-06 20:03 ` J. Bruce Fields
2011-12-06 20:08   ` John Muir
2011-12-06 20:42     ` J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).