All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: linux-arch@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [RFC] get_write_access()/deny_write_access() without inode->i_lock
Date: Mon, 20 Jun 2011 00:51:47 +0100	[thread overview]
Message-ID: <20110619235147.GQ11521@ZenIV.linux.org.uk> (raw)

	I'm seriously tempted to throw away i_lock uses in
{get,deny}_write_access(), as in the patch below.  The question is, how
badly will it suck on various architectures?  I'd expect it to be not
worse than the current version, but...
	BTW, I wonder if we need barriers in {put,allow}_write_access (in
either version).

	Related question: would it make sense to turn that into
atomic_inc_unless_negative/atomic_dec_unless_positive?  I don't
remember any code doing that kind of stuff - no idea if there are
any potential users for that.

diff --git a/fs/namei.c b/fs/namei.c
index 26bef77..7dffe2e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -341,52 +341,6 @@ ok:
 	return security_inode_exec_permission(inode, flags);
 }
 
-/*
- * get_write_access() gets write permission for a file.
- * put_write_access() releases this write permission.
- * This is used for regular files.
- * We cannot support write (and maybe mmap read-write shared) accesses and
- * MAP_DENYWRITE mmappings simultaneously. The i_writecount field of an inode
- * can have the following values:
- * 0: no writers, no VM_DENYWRITE mappings
- * < 0: (-i_writecount) vm_area_structs with VM_DENYWRITE set exist
- * > 0: (i_writecount) users are writing to the file.
- *
- * Normally we operate on that counter with atomic_{inc,dec} and it's safe
- * except for the cases where we don't hold i_writecount yet. Then we need to
- * use {get,deny}_write_access() - these functions check the sign and refuse
- * to do the change if sign is wrong. Exclusion between them is provided by
- * the inode->i_lock spinlock.
- */
-
-int get_write_access(struct inode * inode)
-{
-	spin_lock(&inode->i_lock);
-	if (atomic_read(&inode->i_writecount) < 0) {
-		spin_unlock(&inode->i_lock);
-		return -ETXTBSY;
-	}
-	atomic_inc(&inode->i_writecount);
-	spin_unlock(&inode->i_lock);
-
-	return 0;
-}
-
-int deny_write_access(struct file * file)
-{
-	struct inode *inode = file->f_path.dentry->d_inode;
-
-	spin_lock(&inode->i_lock);
-	if (atomic_read(&inode->i_writecount) > 0) {
-		spin_unlock(&inode->i_lock);
-		return -ETXTBSY;
-	}
-	atomic_dec(&inode->i_writecount);
-	spin_unlock(&inode->i_lock);
-
-	return 0;
-}
-
 /**
  * path_get - get a reference to a path
  * @path: path to get the reference to
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7302e44..ab89aa3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2194,8 +2194,43 @@ static inline bool execute_ok(struct inode *inode)
 	return (inode->i_mode & S_IXUGO) || S_ISDIR(inode->i_mode);
 }
 
-extern int get_write_access(struct inode *);
-extern int deny_write_access(struct file *);
+/*
+ * get_write_access() gets write permission for a file.
+ * put_write_access() releases this write permission.
+ * This is used for regular files.
+ * We cannot support write (and maybe mmap read-write shared) accesses and
+ * MAP_DENYWRITE mmappings simultaneously. The i_writecount field of an inode
+ * can have the following values:
+ * 0: no writers, no VM_DENYWRITE mappings
+ * < 0: (-i_writecount) vm_area_structs with VM_DENYWRITE set exist
+ * > 0: (i_writecount) users are writing to the file.
+ *
+ * Normally we operate on that counter with atomic_{inc,dec} and it's safe
+ * except for the cases where we don't hold i_writecount yet. Then we need to
+ * use {get,deny}_write_access() - these functions check the sign and refuse
+ * to do the change if sign is wrong.
+ */
+static inline int get_write_access(struct inode *inode)
+{
+	int v, v1;
+	for (v = atomic_read(&inode->i_writecount); v >= 0; v = v1) {
+		v1 = atomic_cmpxchg(&inode->i_writecount, v, v + 1);
+		if (likely(v1 == v))
+			return 0;
+	}
+	return -ETXTBSY;
+}
+static inline int deny_write_access(struct file *file)
+{
+	struct inode *inode = file->f_path.dentry->d_inode;
+	int v, v1;
+	for (v = atomic_read(&inode->i_writecount); v <= 0; v = v1) {
+		v1 = atomic_cmpxchg(&inode->i_writecount, v, v + 1);
+		if (likely(v1 == v))
+			return 0;
+	}
+	return -ETXTBSY;
+}
 static inline void put_write_access(struct inode * inode)
 {
 	atomic_dec(&inode->i_writecount);

             reply	other threads:[~2011-06-19 23:51 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-19 23:51 Al Viro [this message]
2011-06-20 12:47 ` [RFC] get_write_access()/deny_write_access() without inode->i_lock David Howells
2011-06-20 13:18   ` Al Viro
2011-06-20 13:20   ` David Howells
2011-06-20 13:20     ` David Howells
2011-06-20 13:21 ` Frantisek Hrbata
2011-06-20 14:15   ` Al Viro
2011-06-20 15:55 ` Linus Torvalds
2011-06-20 15:55   ` Linus Torvalds
2011-06-20 16:13   ` Al Viro
2011-06-20 16:22     ` Linus Torvalds
2011-06-20 16:42       ` Al Viro
2011-06-20 17:03         ` Al Viro
2011-06-20 19:47 ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110619235147.GQ11521@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.