public inbox for linux-kernel@vger.kernel.org
* [RFC][PATCH 1/2] pin kern mounts as writable
@ 2009-08-03 21:59 Dave Hansen
  2009-08-03 21:59 ` [RFC][PATCH 2/2] fix mnt_want_write_file() on special files Dave Hansen
  2009-09-12 13:42 ` [RFC][PATCH 1/2] pin kern mounts as writable Al Viro
  0 siblings, 2 replies; 3+ messages in thread
From: Dave Hansen @ 2009-08-03 21:59 UTC
  To: Al Viro; +Cc: Nick Piggin, linux-kernel, OGAWA Hirofumi, Dave Hansen


If we are going to continue to use mnt_clone_write() inside
of init_file(), then we're going to need some kind of extra
handling.

What I want to do in the next patch is add a debugging check
in mnt_clone_write() to double-check that there *is* a real
writer on the mount before mnt_clone_write() succeeds.  To
do that, we either need to check for MS_KERNMOUNT around
that debug check, or we need to make sure that these kern
mounts *have* a write already.

I'm choosing to make sure they always have a write.
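
The reasoning above can be sketched as a small userspace model. This is
only an illustration, not kernel code: `struct toy_mount` and the `toy_*`
functions are hypothetical stand-ins for the vfsmount writer accounting,
and returning -EINVAL is an assumed substitute for the debugging check
firing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vfsmount: only what the sketch needs. */
struct toy_mount {
	bool readonly;	/* models the result of __mnt_is_readonly() */
	int writers;	/* models the per-mount writer count */
};

/* Models mnt_want_write(): the full check, refused on a r/o mount. */
static int toy_want_write(struct toy_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/*
 * Models mnt_clone_write() with the proposed debugging check.
 * Assumption: -EINVAL stands in for the point where the check would fire.
 */
static int toy_clone_write(struct toy_mount *m)
{
	if (m->writers <= 0)
		return -EINVAL;	/* no existing writer to clone from */
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Models kern_mount_data() with this patch: pin one permanent writer. */
static int toy_kern_mount(struct toy_mount *m)
{
	m->readonly = false;
	m->writers = 0;
	return toy_want_write(m);
}
```

In this model, a mount set up through toy_kern_mount() always has
writers >= 1, so toy_clone_write() never trips the check. A mount that
skipped the pin would.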

---

 linux-2.6.git-dave/fs/super.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff -puN fs/super.c~pin-mnt-writable fs/super.c
--- linux-2.6.git/fs/super.c~pin-mnt-writable	2009-08-03 14:47:29.000000000 -0700
+++ linux-2.6.git-dave/fs/super.c	2009-08-03 14:49:49.000000000 -0700
@@ -948,7 +948,19 @@ EXPORT_SYMBOL_GPL(do_kern_mount);
 
 struct vfsmount *kern_mount_data(struct file_system_type *type, void *data)
 {
-	return vfs_kern_mount(type, MS_KERNMOUNT, type->name, data);
+	int err = 0;
+	struct vfsmount *mnt;
+	mnt = vfs_kern_mount(type, MS_KERNMOUNT, type->name, data);
+	if (IS_ERR(mnt))
+		return mnt;
+	/*
+	 * We will never allow this mount to be r/o.  Doing
+	 * this makes it explicit and allows mnt_clone_write()
+	 * to be used unconditionally on this mount.
+	 */
+	err = mnt_want_write(mnt);
+	WARN_ON(err);
+	return mnt;
 }
 
 EXPORT_SYMBOL_GPL(kern_mount_data);
diff -puN include/linux/mount.h~pin-mnt-writable include/linux/mount.h
_


* [RFC][PATCH 2/2] fix mnt_want_write_file() on special files
  2009-08-03 21:59 [RFC][PATCH 1/2] pin kern mounts as writable Dave Hansen
@ 2009-08-03 21:59 ` Dave Hansen
  2009-09-12 13:42 ` [RFC][PATCH 1/2] pin kern mounts as writable Al Viro
  1 sibling, 0 replies; 3+ messages in thread
From: Dave Hansen @ 2009-08-03 21:59 UTC
  To: Al Viro; +Cc: Nick Piggin, linux-kernel, OGAWA Hirofumi, Dave Hansen


mnt_want_write_file() uses the basic assumption that
we can use a reference to a 'struct file' with
FMODE_WRITE set in lieu of all of the expensive checks
to avoid remount,ro races.

The problem is that FMODE_WRITE is not enough.  Special
files never had a mnt_want_write() done for them, so
we have to exclude them.

This also adds a commented-out BUG_ON() that will
reliably detect if anyone tries this again.  However,
it comes at the cost of destroying any and all
performance gains that mnt_clone_write() would have
offered (and then some).

---

 linux-2.6.git-dave/fs/namespace.c |   24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff -puN fs/namespace.c~mnt_want_write_file-0 fs/namespace.c
--- linux-2.6.git/fs/namespace.c~mnt_want_write_file-0	2009-08-03 14:51:51.000000000 -0700
+++ linux-2.6.git-dave/fs/namespace.c	2009-08-03 14:52:39.000000000 -0700
@@ -294,9 +294,17 @@ EXPORT_SYMBOL_GPL(mnt_want_write);
  *
  * After finished, mnt_drop_write must be called as usual to
  * drop the reference.
+ *
+ * Be very careful using this.  You must *guarantee* that
+ * this vfsmount has at least one existing, persistent writer
+ * that can not possibly go away, before calling this.
  */
 int mnt_clone_write(struct vfsmount *mnt)
 {
+	/* This would kill the performance
+	 * optimization in this function
+	BUG_ON(count_mnt_writers(mnt) <= 0);
+	 */
 	/* superblock may be r/o */
 	if (__mnt_is_readonly(mnt))
 		return -EROFS;
@@ -312,14 +320,20 @@ EXPORT_SYMBOL_GPL(mnt_clone_write);
  * @file: the file who's mount on which to take a write
  *
  * This is like mnt_want_write, but it takes a file and can
- * do some optimisations if the file is open for write already
+ * do some optimisations if the file is open for write already.
+ * We do not do mnt_want_write() on read-only or special files,
+ * so we can not use mnt_clone_write() for them.
  */
 int mnt_want_write_file(struct file *file)
 {
-	if (!(file->f_mode & FMODE_WRITE))
-		return mnt_want_write(file->f_path.mnt);
-	else
-		return mnt_clone_write(file->f_path.mnt);
+	struct path *path = &file->f_path;
+	struct inode *inode = path->dentry->d_inode;
+
+	if ((file->f_mode & FMODE_WRITE) &&
+	    !special_file(inode->i_mode))
+		return mnt_clone_write(path->mnt);
+
+	return mnt_want_write(path->mnt);
 }
 EXPORT_SYMBOL_GPL(mnt_want_write_file);
 
diff -puN ./lib/Kconfig.debug~mnt_want_write_file-0 ./lib/Kconfig.debug
_


* Re: [RFC][PATCH 1/2] pin kern mounts as writable
  2009-08-03 21:59 [RFC][PATCH 1/2] pin kern mounts as writable Dave Hansen
  2009-08-03 21:59 ` [RFC][PATCH 2/2] fix mnt_want_write_file() on special files Dave Hansen
@ 2009-09-12 13:42 ` Al Viro
  1 sibling, 0 replies; 3+ messages in thread
From: Al Viro @ 2009-09-12 13:42 UTC
  To: Dave Hansen; +Cc: Nick Piggin, linux-kernel, OGAWA Hirofumi

On Mon, Aug 03, 2009 at 02:59:40PM -0700, Dave Hansen wrote:

> If we are going to continue to use mnt_clone_write() inside
> of init_file(), then we're going to need some kind of extra
> handling.
> 
> What I want to do in the next patch is add a debugging check
> in mnt_clone_write() to double-check that there *is* a real
> writer on the mount before mnt_clone_write() succeeds.  To
> do that, we either need to check for MS_KERNMOUNT around
> that debug check, or we need to make sure that these kern
> mounts *have* a write already.
> 
> I'm choosing to make sure they always have a write.

> +	err = mnt_want_write(mnt);
> +	WARN_ON(err);

And what would happen to that when we do it to a filesystem that is
made read-only by its ->get_sb()?
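
The case being asked about can be illustrated with a toy userspace
model. Everything here is hypothetical (`struct toy_sb`, `struct
toy_mnt`, and the `toy_*` functions are not kernel APIs), and it assumes
the write request is refused with -EROFS whenever the superblock is
read-only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_sb {
	bool rdonly;		/* models a read-only superblock */
};

struct toy_mnt {
	struct toy_sb *sb;
	int writers;		/* models the per-mount writer count */
};

/* Models mnt_want_write(): refused when the superblock is r/o. */
static int toy_mnt_want_write(struct toy_mnt *m)
{
	if (m->sb->rdonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Hypothetical ->get_sb() that forces the filesystem read-only. */
static void toy_get_sb_forced_ro(struct toy_sb *sb)
{
	sb->rdonly = true;
}

/* Returns the value that the quoted hunk would hand to WARN_ON(). */
static int toy_kern_mount_pinned(struct toy_mnt *m, struct toy_sb *sb)
{
	toy_get_sb_forced_ro(sb);
	m->sb = sb;
	m->writers = 0;
	return toy_mnt_want_write(m);
}
```

Under these assumptions the pin fails: the error is non-zero, so the
warning would trigger, and the mount is left with no writer at all.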

