linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install
@ 2015-04-16 12:16 Mateusz Guzik
  2015-04-16 17:47 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Mateusz Guzik @ 2015-04-16 12:16 UTC (permalink / raw)
  To: Alexander Viro
  Cc: Andrew Morton, Paul E. McKenney, Yann Droneaud,
	Konstantin Khlebnikov, linux-fsdevel, linux-kernel

Hi,

Currently obtaining a new file descriptor results in locking fdtable
twice - once in order to reserve a slot and second time to fill it.

Hack below gets rid of the second lock usage.

It gives me a ~30% speedup (~300k ops -> ~400k ops) in a microbenchmark
where 16 threads create a pipe (2 fds) and call 2 * close.

Results are fluctuating and even with the patch sometimes drop to around
~300k ops. However, without the patch they never get higher.

I can come up with a better benchmark if that's necessary.

Comments?

==============================================

Subject: [PATCH] fs: use a sequence counter instead of file_lock in fd_install

Because the lock is not held, it is possible that fdtable will be
reallocated as we fill it.

RCU is used to guarantee the old table is not freed just in case we
happen to write to it (which is harmless).

sequence counter is used to ensure we updated the right table.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
---
 fs/file.c               | 24 +++++++++++++++++++-----
 include/linux/fdtable.h |  5 +++++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 93c5f89..bd1ef4c 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -165,8 +165,10 @@ static int expand_fdtable(struct files_struct *files, int nr)
 	cur_fdt = files_fdtable(files);
 	if (nr >= cur_fdt->max_fds) {
 		/* Continue as planned */
+		write_seqcount_begin(&files->fdt_seqcount);
 		copy_fdtable(new_fdt, cur_fdt);
 		rcu_assign_pointer(files->fdt, new_fdt);
+		write_seqcount_end(&files->fdt_seqcount);
 		if (cur_fdt != &files->fdtab)
 			call_rcu(&cur_fdt->rcu, free_fdtable_rcu);
 	} else {
@@ -256,6 +258,7 @@ struct files_struct *dup_fd(struct files_struct *oldf, int *errorp)
 	atomic_set(&newf->count, 1);
 
 	spin_lock_init(&newf->file_lock);
+	seqcount_init(&newf->fdt_seqcount);
 	newf->next_fd = 0;
 	new_fdt = &newf->fdtab;
 	new_fdt->max_fds = NR_OPEN_DEFAULT;
@@ -429,6 +432,7 @@ void exit_files(struct task_struct *tsk)
 
 struct files_struct init_files = {
 	.count		= ATOMIC_INIT(1),
+	.fdt_seqcount	= SEQCNT_ZERO(fdt_seqcount),
 	.fdt		= &init_files.fdtab,
 	.fdtab		= {
 		.max_fds	= NR_OPEN_DEFAULT,
@@ -552,12 +556,22 @@ EXPORT_SYMBOL(put_unused_fd);
 void __fd_install(struct files_struct *files, unsigned int fd,
 		struct file *file)
 {
+	unsigned long seq;
 	struct fdtable *fdt;
-	spin_lock(&files->file_lock);
-	fdt = files_fdtable(files);
-	BUG_ON(fdt->fd[fd] != NULL);
-	rcu_assign_pointer(fdt->fd[fd], file);
-	spin_unlock(&files->file_lock);
+
+	rcu_read_lock();
+	do {
+		seq = read_seqcount_begin(&files->fdt_seqcount);
+		fdt = files_fdtable_seq(files);
+		/*
+		 * Entry in the table can already be equal to file if we
+		 * had to restart and copy_fdtable picked up our update.
+		 */
+		BUG_ON(!(fdt->fd[fd] == NULL || fdt->fd[fd] == file));
+		rcu_assign_pointer(fdt->fd[fd], file);
+		smp_mb();
+	} while (__read_seqcount_retry(&files->fdt_seqcount, seq));
+	rcu_read_unlock();
 }
 
 void fd_install(unsigned int fd, struct file *file)
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 230f87b..9e41765 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -12,6 +12,7 @@
 #include <linux/types.h>
 #include <linux/init.h>
 #include <linux/fs.h>
+#include <linux/seqlock.h>
 
 #include <linux/atomic.h>
 
@@ -47,6 +48,7 @@ struct files_struct {
    * read mostly part
    */
 	atomic_t count;
+	seqcount_t fdt_seqcount;
 	struct fdtable __rcu *fdt;
 	struct fdtable fdtab;
   /*
@@ -69,6 +71,9 @@ struct dentry;
 #define files_fdtable(files) \
 	rcu_dereference_check_fdtable((files), (files)->fdt)
 
+#define files_fdtable_seq(files) \
+	rcu_dereference((files)->fdt)
+
 /*
  * The caller must ensure that fd table isn't shared or hold rcu or file lock
  */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2015-06-30 13:54 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-16 12:16 [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install Mateusz Guzik
2015-04-16 17:47 ` Eric Dumazet
2015-04-16 18:09 ` Al Viro
2015-04-16 20:42   ` Eric Dumazet
2015-04-16 20:55     ` Eric Dumazet
2015-04-16 22:00       ` Mateusz Guzik
2015-04-16 22:52         ` Eric Dumazet
2015-04-16 22:35   ` Mateusz Guzik
2015-04-17 21:46 ` Eric Dumazet
2015-04-17 22:16   ` Mateusz Guzik
2015-04-17 23:02     ` Al Viro
2015-04-18 19:41       ` Eric Dumazet
2015-04-20 13:41         ` Mateusz Guzik
2015-04-20 16:46           ` Eric Dumazet
2015-04-20 16:48             ` Eric Dumazet
2015-04-20 13:06       ` Mateusz Guzik
2015-04-20 13:43         ` Mateusz Guzik
2015-04-20 15:10           ` Mateusz Guzik
2015-04-20 17:15             ` Eric Dumazet
2015-04-20 20:49               ` Eric Dumazet
2015-04-21 18:05                 ` Eric Dumazet
2015-04-21 20:06                   ` Mateusz Guzik
2015-04-21 20:12                     ` Mateusz Guzik
2015-04-21 21:06                       ` Eric Dumazet
2015-04-22  4:59                         ` [PATCH] fs/file.c: don't acquire files->file_lock in fd_install() Eric Dumazet
2015-04-27 19:05                           ` Mateusz Guzik
2015-04-28 16:20                             ` Eric Dumazet
2015-04-29  4:25                           ` [PATCH v2] " Eric Dumazet
2015-06-22  2:32                             ` Al Viro
2015-06-23  5:31                               ` Eric Dumazet
2015-06-30 13:54                             ` [PATCH v3] " Eric Dumazet
2015-04-22 13:31                         ` [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install Mateusz Guzik
2015-04-22 13:55                           ` Eric Dumazet
2015-04-21 20:57                     ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).