public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* missing mnt_drop_write() on open error
@ 2007-09-25 23:14 Miklos Szeredi
  2007-09-25 23:24 ` Dave Hansen
  2007-09-26  1:21 ` Dave Hansen
  0 siblings, 2 replies; 10+ messages in thread
From: Miklos Szeredi @ 2007-09-25 23:14 UTC (permalink / raw)
  To: haveblue; +Cc: akpm, linux-kernel

I get this at umount, if there was a failed open():

WARNING: at fs/namespace.c:586 __mntput()

I think the problem is that may_open() calls mnt_want_write(), but if
open doesn't succeed, mnt_drop_write() will not be called.

Miklos

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-25 23:14 missing mnt_drop_write() on open error Miklos Szeredi
@ 2007-09-25 23:24 ` Dave Hansen
  2007-09-26  1:21 ` Dave Hansen
  1 sibling, 0 replies; 10+ messages in thread
From: Dave Hansen @ 2007-09-25 23:24 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, linux-kernel

On Wed, 2007-09-26 at 01:14 +0200, Miklos Szeredi wrote:
> I get this at umount, if there was a failed open():
> 
> WARNING: at fs/namespace.c:586 __mntput()
> 
> I think the problem is that may_open() calls mnt_want_write(), but if
> open doesn't succeed, mnt_drop_write() will not be called.

I see what you're talking about, and it does look like a bug to me.  I'm
working on establishing where the 'struct file' gets created in relation
to all of this, and where best we should back out the write count.

I should have a patch in a few.

-- Dave


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-25 23:14 missing mnt_drop_write() on open error Miklos Szeredi
  2007-09-25 23:24 ` Dave Hansen
@ 2007-09-26  1:21 ` Dave Hansen
  2007-09-26  8:38   ` Miklos Szeredi
  1 sibling, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2007-09-26  1:21 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, linux-kernel

On Wed, 2007-09-26 at 01:14 +0200, Miklos Szeredi wrote:
> I get this at umount, if there was a failed open():
> 
> WARNING: at fs/namespace.c:586 __mntput()
> 
> I think the problem is that may_open() calls mnt_want_write(), but if
> open doesn't succeed, mnt_drop_write() will not be called.

Does this help?  I'm also thinking that we should change the open_namei*
functions to simply return 'struct file *'.  Those are the only users
other than NFS, and forcing the return of a file like that will force
users to do the fput() on it if they don't want it any more.  We'd just
need to make sure no new may_open() users pop up.  Any thoughts on that?

--

The r/o bind mount patches have an error in them:
they unconditionally take a mnt_want_write() on
all may_open() requests for writeable files.  They
mistakenly do not mnt_drop_write() if may_open()
encounters an error.

---

 lxc-dave/fs/namei.c |   39 ++++++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 9 deletions(-)

diff -puN fs/namei.c~drop-write-if-may_open-fails fs/namei.c
--- lxc/fs/namei.c~drop-write-if-may_open-fails	2007-09-25 18:16:59.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-09-25 18:16:59.000000000 -0700
@@ -1594,11 +1594,23 @@ int vfs_create(struct inode *dir, struct
 	return error;
 }
 
+/*
+ * If you call this with FMODE_WRITE set in flag,
+ * and get a successful return, you will have
+ * acquired a write on nd->mnt.
+ *
+ * Basically, if you use this function, you either
+ * need to fix up that write count manually, or just
+ * return a 'struct file' from your function.  When
+ * you __fput() that 'struct file' it will get fixed
+ * up automatically.
+ */
 int may_open(struct nameidata *nd, int acc_mode, int flag)
 {
 	struct dentry *dentry = nd->dentry;
 	struct inode *inode = dentry->d_inode;
 	int error;
+	int took_mnt_write = 0;
 
 	if (!inode)
 		return -ENOENT;
@@ -1624,42 +1636,48 @@ int may_open(struct nameidata *nd, int a
 	} else if (flag & FMODE_WRITE) {
 		/*
 		 * effectively: !special_file()
-		 * balanced by __fput()
+		 * Balanced by __fput() and
+		 * released on error at the end
+		 * of this function.
 		 */
 		error = mnt_want_write(nd->mnt);
 		if (error)
 			return error;
+		took_mnt_write = 1;
 	}
 
 	error = vfs_permission(nd, acc_mode);
 	if (error)
-		return error;
+		goto out_err;
 	/*
 	 * An append-only file must be opened in append mode for writing.
 	 */
 	if (IS_APPEND(inode)) {
+		error = -EPERM;
 		if  ((flag & FMODE_WRITE) && !(flag & O_APPEND))
-			return -EPERM;
+			goto out_err;
+		error = -EPERM;
 		if (flag & O_TRUNC)
-			return -EPERM;
+			goto out_err;
 	}
 
 	/* O_NOATIME can only be set by the owner or superuser */
+	error = -EPERM;
 	if (flag & O_NOATIME)
 		if (!is_owner_or_cap(inode))
-			return -EPERM;
+			goto out_err;
 
 	/*
 	 * Ensure there are no outstanding leases on the file.
 	 */
 	error = break_lease(inode, flag);
 	if (error)
-		return error;
+		goto out_err;
 
 	if (flag & O_TRUNC) {
 		error = get_write_access(inode);
 		if (error)
-			return error;
+			goto out_err;
 
 		/*
 		 * Refuse to truncate files with mandatory locks held on them.
@@ -1672,12 +1690,15 @@ int may_open(struct nameidata *nd, int a
 		}
 		put_write_access(inode);
 		if (error)
-			return error;
+			goto out_err;
 	} else
 		if (flag & FMODE_WRITE)
 			DQUOT_INIT(inode);
 
-	return 0;
+out_err:
+	if (error && took_mnt_write)
+		mnt_drop_write(nd->mnt);
+	return error;
 }
 
 static int open_namei_create(struct nameidata *nd, struct path *path,

-- Dave


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-26  1:21 ` Dave Hansen
@ 2007-09-26  8:38   ` Miklos Szeredi
  2007-09-26 15:07     ` Christoph Hellwig
                       ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Miklos Szeredi @ 2007-09-26  8:38 UTC (permalink / raw)
  To: haveblue; +Cc: miklos, akpm, linux-kernel

> On Wed, 2007-09-26 at 01:14 +0200, Miklos Szeredi wrote:
> > I get this at umount, if there was a failed open():
> > 
> > WARNING: at fs/namespace.c:586 __mntput()
> > 
> > I think the problem is that may_open() calls mnt_want_write(), but if
> > open doesn't succeed, mnt_drop_write() will not be called.
> 
> Does this help?

It didn't fix it for me, but the patch looks OK.

In __dentry_open() there's still a few places where fput() won't be
called, notably when ->open fails, which is what I'm triggering I
think.

Also even more horrible things can happen because of the
nd->intent.open.file thing.  For example if the lookup routine calls
lookup_instantiate_filp(), and after this, but before may_open() some
error happens, then release_open_intent() will call fput() on the
file, which will cause mnt_drop_write() to be called, even though a
matching mnt_want_write() hasn't yet been called.  Ugly, eh?

> I'm also thinking that we should change the open_namei*
> functions to simply return 'struct file *'.  Those are the only users
> other than NFS, and forcing the return of a file like that will force
> users to do the fput() on it if they don't want it any more.  We'd just
> need to make sure no new may_open() users pop up.  Any thoughts on that?

Yeah, something needs to be done with open, because currently it's way
too convoluted.

Miklos

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-26  8:38   ` Miklos Szeredi
@ 2007-09-26 15:07     ` Christoph Hellwig
  2007-09-26 16:14     ` Dave Hansen
  2007-09-26 16:19     ` Dave Hansen
  2 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2007-09-26 15:07 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: haveblue, akpm, linux-kernel

On Wed, Sep 26, 2007 at 10:38:22AM +0200, Miklos Szeredi wrote:
> Also even more horrible things can happen because of the
> nd->intent.open.file thing.  For example if the lookup routine calls
> lookup_instantiate_filp(), and after this, but before may_open() some
> error happens, then release_open_intent() will call fput() on the
> file, which will cause mnt_drop_write() to be called, even though a
> matching mnt_want_write() hasn't yet been called.  Ugly, eh?

It's more than horrible.  I wouldn't mind using the word utter crap
for it.  It's defintively on my todo list to get rid of this junk
again.  It managed to get in without proper review unfortunately.

> 
> > I'm also thinking that we should change the open_namei*
> > functions to simply return 'struct file *'.  Those are the only users
> > other than NFS, and forcing the return of a file like that will force
> > users to do the fput() on it if they don't want it any more.  We'd just
> > need to make sure no new may_open() users pop up.  Any thoughts on that?
> 
> Yeah, something needs to be done with open, because currently it's way
> too convoluted.

I have some changes for open_namei pending and I'll see if incorporating
this makes sense.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-26  8:38   ` Miklos Szeredi
  2007-09-26 15:07     ` Christoph Hellwig
@ 2007-09-26 16:14     ` Dave Hansen
  2007-09-26 16:19     ` Dave Hansen
  2 siblings, 0 replies; 10+ messages in thread
From: Dave Hansen @ 2007-09-26 16:14 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, linux-kernel, Christoph Hellwig

On Wed, 2007-09-26 at 10:38 +0200, Miklos Szeredi wrote: 
> > On Wed, 2007-09-26 at 01:14 +0200, Miklos Szeredi wrote:
> > > I get this at umount, if there was a failed open():
> > > 
> > > WARNING: at fs/namespace.c:586 __mntput()
> > > 
> > > I think the problem is that may_open() calls mnt_want_write(), but if
> > > open doesn't succeed, mnt_drop_write() will not be called.
> > 
> > Does this help?
> 
> It didn't fix it for me, but the patch looks OK.
> 
> In __dentry_open() there's still a few places where fput() won't be
> called, notably when ->open fails, which is what I'm triggering I
> think.
>
> Also even more horrible things can happen because of the
> nd->intent.open.file thing.  For example if the lookup routine calls
> lookup_instantiate_filp(), and after this, but before may_open() some
> error happens, then release_open_intent() will call fput() on the
> file, which will cause mnt_drop_write() to be called, even though a
> matching mnt_want_write() hasn't yet been called.  Ugly, eh?

I used to have a patch that didn't completely trust that all files with
FMODE_WRITE set to have taken a write on the mnt.  I think I used a flag
to indicate whether or not a particular file had a mnt_want_write() done
on its behalf.  It somewhat artificially keeps the mnt write count
balanced, but I think it will let us detect when things like this go on.

> > I'm also thinking that we should change the open_namei*
> > functions to simply return 'struct file *'.  Those are the only users
> > other than NFS, and forcing the return of a file like that will force
> > users to do the fput() on it if they don't want it any more.  We'd just
> > need to make sure no new may_open() users pop up.  Any thoughts on that?
> 
> Yeah, something needs to be done with open, because currently it's way
> too convoluted.

Sounds like Christoph has some ideas...

-- Dave


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-26  8:38   ` Miklos Szeredi
  2007-09-26 15:07     ` Christoph Hellwig
  2007-09-26 16:14     ` Dave Hansen
@ 2007-09-26 16:19     ` Dave Hansen
  2007-09-26 17:50       ` Miklos Szeredi
  2 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2007-09-26 16:19 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, linux-kernel, hch

On Wed, 2007-09-26 at 10:38 +0200, Miklos Szeredi wrote:
> In __dentry_open() there's still a few places where fput() won't be
> called, notably when ->open fails, which is what I'm triggering I
> think.
> 
> Also even more horrible things can happen because of the
> nd->intent.open.file thing.  For example if the lookup routine calls
> lookup_instantiate_filp(), and after this, but before may_open() some
> error happens, then release_open_intent() will call fput() on the
> file, which will cause mnt_drop_write() to be called, even though a
> matching mnt_want_write() hasn't yet been called.  Ugly, eh? 

I'm not sure it is _that_ horrible. ;)

Do you see any reason we can't just shadow the
get/put_write_access(inode) calls with mnt_want/drop_write() calls?  I
think they're always matched.

-- Dave


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-26 16:19     ` Dave Hansen
@ 2007-09-26 17:50       ` Miklos Szeredi
  2007-09-26 18:08         ` Dave Hansen
  0 siblings, 1 reply; 10+ messages in thread
From: Miklos Szeredi @ 2007-09-26 17:50 UTC (permalink / raw)
  To: haveblue; +Cc: miklos, akpm, linux-kernel, hch

> On Wed, 2007-09-26 at 10:38 +0200, Miklos Szeredi wrote:
> > In __dentry_open() there's still a few places where fput() won't be
> > called, notably when ->open fails, which is what I'm triggering I
> > think.
> > 
> > Also even more horrible things can happen because of the
> > nd->intent.open.file thing.  For example if the lookup routine calls
> > lookup_instantiate_filp(), and after this, but before may_open() some
> > error happens, then release_open_intent() will call fput() on the
> > file, which will cause mnt_drop_write() to be called, even though a
> > matching mnt_want_write() hasn't yet been called.  Ugly, eh? 
> 
> I'm not sure it is _that_ horrible. ;)
> 
> Do you see any reason we can't just shadow the
> get/put_write_access(inode) calls with mnt_want/drop_write() calls?  I
> think they're always matched.

Maybe.  Can we do the mnt_want_write() from __dentry_open(), instead
of may_open()?  That would be a lot cleaner.

Btw, may_open() doesn't do mnt_want_write() around the truncation if
file is opened with O_TRUNC | O_RDONLY.

Miklos

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-26 17:50       ` Miklos Szeredi
@ 2007-09-26 18:08         ` Dave Hansen
  2007-09-26 18:54           ` Miklos Szeredi
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2007-09-26 18:08 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, linux-kernel, hch

On Wed, 2007-09-26 at 19:50 +0200, Miklos Szeredi wrote:
> Maybe.  Can we do the mnt_want_write() from __dentry_open(), instead
> of may_open()?  That would be a lot cleaner.

I'll explore that.  It may make very good sense.

> Btw, may_open() doesn't do mnt_want_write() around the truncation if
> file is opened with O_TRUNC | O_RDONLY.

What's the path to may_open() in that case?  open_namei() should wrap
all callers other than nfs, and it does:

	/* O_TRUNC implies we need access checks for write permissions */
        if (flag & O_TRUNC)
                acc_mode |= MAY_WRITE;

Which should trigger the may_open() code.  

later in open_namei():
...
ok:
        error = may_open(nd, acc_mode, flag);
        if (error)
                goto exit;
        return 0;


-- Dave


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: missing mnt_drop_write() on open error
  2007-09-26 18:08         ` Dave Hansen
@ 2007-09-26 18:54           ` Miklos Szeredi
  0 siblings, 0 replies; 10+ messages in thread
From: Miklos Szeredi @ 2007-09-26 18:54 UTC (permalink / raw)
  To: haveblue; +Cc: miklos, akpm, linux-kernel, hch

> > Btw, may_open() doesn't do mnt_want_write() around the truncation if
> > file is opened with O_TRUNC | O_RDONLY.
> 
> What's the path to may_open() in that case?  open_namei() should wrap
> all callers other than nfs, and it does:
> 
> 	/* O_TRUNC implies we need access checks for write permissions */
>         if (flag & O_TRUNC)
>                 acc_mode |= MAY_WRITE;
> 
> Which should trigger the may_open() code.  

Ah, I missed that.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-09-26 18:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-25 23:14 missing mnt_drop_write() on open error Miklos Szeredi
2007-09-25 23:24 ` Dave Hansen
2007-09-26  1:21 ` Dave Hansen
2007-09-26  8:38   ` Miklos Szeredi
2007-09-26 15:07     ` Christoph Hellwig
2007-09-26 16:14     ` Dave Hansen
2007-09-26 16:19     ` Dave Hansen
2007-09-26 17:50       ` Miklos Szeredi
2007-09-26 18:08         ` Dave Hansen
2007-09-26 18:54           ` Miklos Szeredi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox