public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* patch for allowing msdos/vfat nfs exports
@ 2001-07-23 23:55 Nathan Laredo
  2001-07-24  1:12 ` Alexander Viro
  0 siblings, 1 reply; 5+ messages in thread
From: Nathan Laredo @ 2001-07-23 23:55 UTC (permalink / raw)
  To: linux-kernel

Linus was nice enough to change the kernel this morning for me
to allow msdos/vfat NFS exports.  Since this is totally new code
and there is no guarantee it won't destroy everything on your
fat filesystem he asked me to post the diff to the kernel mailing
list so more people could look at it/test it out before it goes
back into the standard kernel.

The patch below is relative to Linus' current working tree
(which should be the same 2.4.7 in theory).

I've been using it for half a day now and so far it hasn't done
anything bad, but please be careful if you decide to test it and
backup your data and after testing, be sure to compare your data
to your backup.

Note: this patch will NOT work for older kernels.  dentry_to_fh and
fh_to_dentry are provided (just as in reiserfs).

Please send me email directly with your experiences using this patch
since I offered to collect them for Linus.

-- Nathan Laredo
nlaredo@transmeta.com


diff -r -u linux/fs/fat/inode.c linux-new/fs/fat/inode.c
--- linux/fs/fat/inode.c	Mon Jun 11 19:15:27 2001
+++ linux-new/fs/fat/inode.c	Mon Jul 23 14:35:41 2001
@@ -403,12 +403,60 @@
 	inode->i_nlink = fat_subdirs(inode)+2;
 }
 
+static int fat_dentry_to_fh(struct dentry *dentry, __u32 *data, int *lenp, int need_parent)
+{
+	struct inode *inode = dentry->d_inode;
+	unsigned int i_pos = MSDOS_I(inode)->i_location;
+
+	*data = i_pos;
+	*lenp = 1;
+	return 1;
+}
+
+static struct dentry *fat_fh_to_dentry(struct super_block *sb, __u32 *data, int len, int fhtype, int parent)
+{
+	struct list_head *lp;
+	struct dentry *result;
+	unsigned int i_pos = data[0];
+	struct inode *inode;
+
+	inode = fat_iget(sb, i_pos);
+
+	if (!inode)
+		return ERR_PTR(-ESTALE);
+
+	/* now to find a dentry.
+	 * If possible, get a well-connected one
+	 */
+	spin_lock(&dcache_lock);
+	for (lp = inode->i_dentry.next; lp != &inode->i_dentry ; lp=lp->next) {
+	        result = list_entry(lp,struct dentry, d_alias);
+	        if (! (result->d_flags & DCACHE_NFSD_DISCONNECTED)) {
+	                 dget_locked(result);
+	                 result->d_vfs_flags |= DCACHE_REFERENCED;
+	                 spin_unlock(&dcache_lock);
+	                 iput(inode);
+	                 return result;
+	         }
+	}
+	spin_unlock(&dcache_lock);
+	result = d_alloc_root(inode);
+	if (result == NULL) {
+	         iput(inode);
+	         return ERR_PTR(-ENOMEM);
+	}
+	result->d_flags |= DCACHE_NFSD_DISCONNECTED;
+	return result;
+}
+
 static struct super_operations fat_sops = { 
 	write_inode:	fat_write_inode,
 	delete_inode:	fat_delete_inode,
 	put_super:	fat_put_super,
 	statfs:		fat_statfs,
 	clear_inode:	fat_clear_inode,
+	fh_to_dentry:	fat_fh_to_dentry,
+	dentry_to_fh:	fat_dentry_to_fh
 };
 
 /*


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: patch for allowing msdos/vfat nfs exports
  2001-07-23 23:55 patch for allowing msdos/vfat nfs exports Nathan Laredo
@ 2001-07-24  1:12 ` Alexander Viro
  2001-07-24  1:59   ` Neil Brown
  2001-07-24 16:37   ` Linus Torvalds
  0 siblings, 2 replies; 5+ messages in thread
From: Alexander Viro @ 2001-07-24  1:12 UTC (permalink / raw)
  To: Nathan Laredo; +Cc: Linus Torvalds, linux-kernel



On Mon, 23 Jul 2001, Nathan Laredo wrote:

> +	result = d_alloc_root(inode);
> +	if (result == NULL) {
> +	         iput(inode);
> +	         return ERR_PTR(-ENOMEM);
> +	}
> +	result->d_flags |= DCACHE_NFSD_DISCONNECTED;
> +	return result;

Erm... AFAICS it got a race - think of doing that for directory when
dentry is already gone, but inode still in cache. You will get
nfsd_findparent() called and that will give funny results on FAT.

The worst thing being, it _will_ get a decently-looking inode. Inode that
will point to the same blocks as parent directory, but will be completely
independent (different location).

Notice that we will end up reading directories that might be changing
under us - struct inode is different, so exclusion simply doesn't work.

IOW, we need
	if (S_ISDIR(inode->i_mode))
		return ERR_PTR(-ESTALE);
immediately before the chunk above.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: patch for allowing msdos/vfat nfs exports
  2001-07-24  1:12 ` Alexander Viro
@ 2001-07-24  1:59   ` Neil Brown
  2001-07-24 16:37   ` Linus Torvalds
  1 sibling, 0 replies; 5+ messages in thread
From: Neil Brown @ 2001-07-24  1:59 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Nathan Laredo, Linus Torvalds, linux-kernel

On Monday July 23, viro@math.psu.edu wrote:
> 
> 
> On Mon, 23 Jul 2001, Nathan Laredo wrote:
> 
> > +	result = d_alloc_root(inode);
> > +	if (result == NULL) {
> > +	         iput(inode);
> > +	         return ERR_PTR(-ENOMEM);
> > +	}
> > +	result->d_flags |= DCACHE_NFSD_DISCONNECTED;
> > +	return result;
> 
> Erm... AFAICS it got a race - think of doing that for directory when
> dentry is already gone, but inode still in cache. You will get
> nfsd_findparent() called and that will give funny results on FAT.
> 
> The worst thing being, it _will_ get a decently-looking inode. Inode that
> will point to the same blocks as parent directory, but will be completely
> independent (different location).
> 
> Notice that we will end up reading directories that might be changing
> under us - struct inode is different, so exclusion simply doesn't work.
> 
> IOW, we need
> 	if (S_ISDIR(inode->i_mode))
> 		return ERR_PTR(-ESTALE);
> immediately before the chunk above.


I did a patch for this about the same time that the dentry_to_fh
interface went into the superblock.
I posted it the the NFS mailing list and to everybody who had explicitly
asked about this on any of the lists that I read and asked them to try
it out and let me know if it worked.  I got precisely zero response.
I take this fairly positively, as I know people are far more likely to
complain "it doesn't work" than to acknowledge "it does", but it would
have been nice to hear something.

I was going to send it to Linus just as soon as someone said "it works
for me", but no-one did :-(.

It is somewhat more cautious that the patch that Nathan sent.  I hope
it will pass Al's inspection too.  It is not perfect, but I think it
is the best that you can do for FAT without lots of ugly work.

It should be safe against inode number re-use, and should cope with
the server restarting.

It is below.

NeilBrown



--- ./fs/fat/inode.c	2001/07/23 05:47:25	1.1
+++ ./fs/fat/inode.c	2001/07/23 05:47:28	1.2
@@ -154,15 +154,19 @@
 
 void fat_delete_inode(struct inode *inode)
 {
-	lock_kernel();
-	inode->i_size = 0;
-	fat_truncate(inode);
-	unlock_kernel();
+	if (!is_bad_inode(inode)) {
+		lock_kernel();
+		inode->i_size = 0;
+		fat_truncate(inode);
+		unlock_kernel();
+	}
 	clear_inode(inode);
 }
 
 void fat_clear_inode(struct inode *inode)
 {
+	if (is_bad_inode(inode))
+		return;
 	lock_kernel();
 	spin_lock(&fat_inode_lock);
 	fat_cache_inval_inode(inode);
@@ -372,6 +376,7 @@
 	inode->i_uid = sbi->options.fs_uid;
 	inode->i_gid = sbi->options.fs_gid;
 	inode->i_version = ++event;
+	inode->i_generation = 0;
 	inode->i_mode = (S_IRWXUGO & ~sbi->options.fs_umask) | S_IFDIR;
 	inode->i_op = sbi->dir_ops;
 	inode->i_fop = &fat_dir_operations;
@@ -403,12 +408,123 @@
 	inode->i_nlink = fat_subdirs(inode)+2;
 }
 
+/*
+ * a FAT file handle with fhtype 3 is
+ *  0/  i_ino - for fast, reliable lookup if still in the cache
+ *  1/  i_generation - to see if i_ino is still valid
+ *          bit 0 == 0 iff directory
+ *  2/  i_location - if ino has changed, but still in cache
+ *  3/  i_logstart - to semi-verify inode found at i_location
+ *  4/  parent->i_logstart - maybe used to hunt for the file on disc
+ *
+ */
+struct dentry *fat_fh_to_dentry(struct super_block *sb, __u32 *fh,
+				int len, int fhtype, int parent)
+{
+	struct inode *inode = NULL;
+	struct list_head *lp;
+	struct dentry *result;
+
+	if (fhtype != 3)
+		return NULL;
+	if (len < 5)
+		return NULL;
+	if (parent)
+		return NULL; /* We cannot find the parent,
+				It better just *be* there */
+
+	inode = iget(sb, fh[0]);
+	if (!inode || is_bad_inode(inode) ||
+	    inode->i_generation != fh[1]) {
+		if (inode) iput(inode);
+		inode = NULL;
+	}
+	if (!inode) {
+		/* try 2 - see if i_location is in F-d-c
+		 * require i_logstart to be the same
+		 * Will fail if you truncate and then re-write
+		 */
+
+		inode = fat_iget(sb, fh[2]);
+		if (inode && MSDOS_I(inode)->i_logstart != fh[3]) {
+			iput(inode);
+			inode = NULL;
+		}
+	}
+	if (!inode) {
+		/* For now, do nothing
+		 * What we could do is:
+		 * follow the file starting at fh[4], and record
+		 * the ".." entry, and the name of the fh[2] entry.
+		 * The follow the ".." file finding the next step up.
+		 * This way we build a path to the root of
+		 * the tree. If this works, we lookup the path and so
+		 * get this inode into the cache.
+		 * Finally try the fat_iget lookup again
+		 * If that fails, then weare totally out of luck
+		 * But all that is for another day
+		 */
+	}
+	if (!inode)
+		return ERR_PTR(-ESTALE);
+
+	
+	/* now to find a dentry.
+	 * If possible, get a well-connected one
+	 *
+	 * Given the way that we found the inode, it *MUST* be
+	 * well-connected, but it is easiest to just copy the
+	 * code.
+	 */
+	spin_lock(&dcache_lock);
+	for (lp = inode->i_dentry.next; lp != &inode->i_dentry ; lp=lp->next) {
+		result = list_entry(lp,struct dentry, d_alias);
+		if (! (result->d_flags & DCACHE_NFSD_DISCONNECTED)) {
+			dget_locked(result);
+			result->d_vfs_flags |= DCACHE_REFERENCED;
+			spin_unlock(&dcache_lock);
+			iput(inode);
+			return result;
+		}
+	}
+	spin_unlock(&dcache_lock);
+	result = d_alloc_root(inode);
+	if (result == NULL) {
+		iput(inode);
+		return ERR_PTR(-ENOMEM);
+	}
+	result->d_flags |= DCACHE_NFSD_DISCONNECTED;
+	return result;
+
+		
+}
+
+int fat_dentry_to_fh(struct dentry *de, __u32 *fh, int *lenp, int needparent)
+{
+	int len = *lenp;
+	struct inode *inode =  de->d_inode;
+	
+	if (len < 5)
+		return 255; /* no room */
+	*lenp = 5;
+	fh[0] = inode->i_ino;
+	fh[1] = inode->i_generation;
+	fh[2] = MSDOS_I(inode)->i_location;
+	fh[3] = MSDOS_I(inode)->i_logstart;
+	fh[4] = MSDOS_I(de->d_parent->d_inode)->i_logstart;
+	return 3;
+}
+
 static struct super_operations fat_sops = { 
 	write_inode:	fat_write_inode,
 	delete_inode:	fat_delete_inode,
 	put_super:	fat_put_super,
 	statfs:		fat_statfs,
 	clear_inode:	fat_clear_inode,
+
+	read_inode:	make_bad_inode,
+	fh_to_dentry:	fat_fh_to_dentry,
+	dentry_to_fh:	fat_dentry_to_fh,
 };
 
 /*
@@ -771,7 +887,10 @@
 	inode->i_uid = sbi->options.fs_uid;
 	inode->i_gid = sbi->options.fs_gid;
 	inode->i_version = ++event;
+	inode->i_generation = CURRENT_TIME;
+	
 	if ((de->attr & ATTR_DIR) && !IS_FREE(de->name)) {
+		inode->i_generation &= ~1;
 		inode->i_mode = MSDOS_MKMODE(de->attr,S_IRWXUGO &
 		    ~sbi->options.fs_umask) | S_IFDIR;
 		inode->i_op = sbi->dir_ops;
@@ -802,6 +921,7 @@
 			}
 		MSDOS_I(inode)->mmu_private = inode->i_size;
 	} else { /* not a directory */
+		inode->i_generation |= 1;
 		inode->i_mode = MSDOS_MKMODE(de->attr,
 		    ((IS_NOEXEC(inode) || 
 		      (sbi->options.showexec &&

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: patch for allowing msdos/vfat nfs exports
  2001-07-24  1:12 ` Alexander Viro
  2001-07-24  1:59   ` Neil Brown
@ 2001-07-24 16:37   ` Linus Torvalds
  1 sibling, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2001-07-24 16:37 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Nathan Laredo, Neil Brown, linux-kernel


On Mon, 23 Jul 2001, Alexander Viro wrote:
>
> On Mon, 23 Jul 2001, Nathan Laredo wrote:
>
> > +	result = d_alloc_root(inode);
> > +	if (result == NULL) {
> > +	         iput(inode);
> > +	         return ERR_PTR(-ENOMEM);
> > +	}
> > +	result->d_flags |= DCACHE_NFSD_DISCONNECTED;
> > +	return result;
>
> Erm... AFAICS it got a race - think of doing that for directory when
> dentry is already gone, but inode still in cache. You will get
> nfsd_findparent() called and that will give funny results on FAT.

Note that the code was definitely cribbed from reiserfs, and when I stole
it I had this very strong feeling that "This really should be supported in
the VFS layer instead of hidden in the low-level filesystems".

We are, after all, working with very internal knowledge of not only the
way the dentry lists are attached to the inode, but also about the whole
DISCONNECTED thing etc.

> The worst thing being, it _will_ get a decently-looking inode. Inode that
> will point to the same blocks as parent directory, but will be completely
> independent (different location).

Absolutely. And some of the NFS-export problems won't even be avoidable at
all: FAT does not have a notion of inode versions etc, and we do not have
a way to fix that.

However, what we could do is add more sanity-testing, and for example also
add the parent inode location to the dentry_to_fh code. We have space.

The patch was a fairly quick hack. What it does show is that (a) the
dentry_to_fh approach is actually a very usable one in itself and wasn't
just an ugly reiserfs hack, and (b) FAT _can_ be exported.

The reason I asked Nathan to post the damn thing instead of just applying
it to the tree was to get these kinds of comments, and hopefully move the
VFS stuff to a VFS location, ie start exporting a vfs_inode_to_dentry()
function that reiserfs, FAT and quite possibly others can sanely share.

Al, Neil, care to work something out between you? I saw that Neil had
something already..

			Linus


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: patch for allowing msdos/vfat nfs exports
       [not found] <no.id>
@ 2001-07-24 17:51 ` Alan Cox
  0 siblings, 0 replies; 5+ messages in thread
From: Alan Cox @ 2001-07-24 17:51 UTC (permalink / raw)
  To: Nathan Laredo; +Cc: linux-kernel

> I've been using it for half a day now and so far it hasn't done
> anything bad, but please be careful if you decide to test it and
> backup your data and after testing, be sure to compare your data
> to your backup.

Rename ?

> +	struct inode *inode = dentry->d_inode;
> +	unsigned int i_pos = MSDOS_I(inode)->i_location;

i_location is not a constant across renames or other operations, so you may
inadvertantly do I/O to completely the wrong file


The infrastructure looks great, I just don't think your handles are safe 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2001-07-24 17:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-07-23 23:55 patch for allowing msdos/vfat nfs exports Nathan Laredo
2001-07-24  1:12 ` Alexander Viro
2001-07-24  1:59   ` Neil Brown
2001-07-24 16:37   ` Linus Torvalds
     [not found] <no.id>
2001-07-24 17:51 ` Alan Cox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox