* [patch 1/2] fs: cleanup files_lock
@ 2010-03-16 9:44 Nick Piggin
2010-03-16 9:46 ` [patch 2/2] fs: scale files_lock Nick Piggin
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Nick Piggin @ 2010-03-16 9:44 UTC (permalink / raw)
To: Al Viro, Frank Mayhar, John Stultz, Andi Kleen, linux-fsdevel
Cc: Greg Kroah-Hartman, Alan Cox, Eric W. Biederman, Linus Torvalds
I would like to start sending bits of vfs scalability improvements to
be reviewed and hopefully merged.
I will start with files_lock. Last time this one came up, it was
criticised because some hoped to get rid of files list, and because
it didn't have enough justification of the scalability improvement.
For the first criticism, it isn't any more difficult to rip out if
we are ever able to remove files list. For the second, I have gathered
some statistics and written a better justification. Andi, I believe, is
finding that kbuild is becoming limited by files_lock on larger systems.
--
fs: cleanup files_lock
Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
manipulate the per-sb files list; unexport the files_lock spinlock.
Signed-off-by: Nick Piggin <npiggin@suse.de>
---
drivers/char/pty.c | 6 +++++-
drivers/char/tty_io.c | 26 ++++++++++++++++++--------
fs/file_table.c | 42 ++++++++++++++++++------------------------
fs/open.c | 4 ++--
include/linux/fs.h | 8 +++-----
include/linux/tty.h | 1 +
security/selinux/hooks.c | 4 ++--
7 files changed, 49 insertions(+), 42 deletions(-)
Index: linux-2.6/drivers/char/pty.c
===================================================================
--- linux-2.6.orig/drivers/char/pty.c
+++ linux-2.6/drivers/char/pty.c
@@ -649,7 +649,11 @@ static int __ptmx_open(struct inode *ino
set_bit(TTY_PTY_LOCK, &tty->flags); /* LOCK THE SLAVE */
filp->private_data = tty;
- file_move(filp, &tty->tty_files);
+
+ file_sb_list_del(filp); /* __dentry_open has put it on the sb list */
+ spin_lock(&tty_files_lock);
+ list_add(&filp->f_u.fu_list, &tty->tty_files);
+ spin_unlock(&tty_files_lock);
retval = devpts_pty_new(inode, tty->link);
if (retval)
Index: linux-2.6/drivers/char/tty_io.c
===================================================================
--- linux-2.6.orig/drivers/char/tty_io.c
+++ linux-2.6/drivers/char/tty_io.c
@@ -136,6 +136,9 @@ LIST_HEAD(tty_drivers); /* linked list
DEFINE_MUTEX(tty_mutex);
EXPORT_SYMBOL(tty_mutex);
+/* Spinlock to protect the tty->tty_files list */
+DEFINE_SPINLOCK(tty_files_lock);
+
static ssize_t tty_read(struct file *, char __user *, size_t, loff_t *);
static ssize_t tty_write(struct file *, const char __user *, size_t, loff_t *);
ssize_t redirected_tty_write(struct file *, const char __user *,
@@ -234,11 +237,11 @@ static int check_tty_count(struct tty_st
struct list_head *p;
int count = 0;
- file_list_lock();
+ spin_lock(&tty_files_lock);
list_for_each(p, &tty->tty_files) {
count++;
}
- file_list_unlock();
+ spin_unlock(&tty_files_lock);
if (tty->driver->type == TTY_DRIVER_TYPE_PTY &&
tty->driver->subtype == PTY_TYPE_SLAVE &&
tty->link && tty->link->count)
@@ -517,7 +520,7 @@ static void do_tty_hangup(struct work_st
lock_kernel();
check_tty_count(tty, "do_tty_hangup");
- file_list_lock();
+ spin_lock(&tty_files_lock);
/* This breaks for file handles being sent over AF_UNIX sockets ? */
list_for_each_entry(filp, &tty->tty_files, f_u.fu_list) {
if (filp->f_op->write == redirected_tty_write)
@@ -528,7 +531,7 @@ static void do_tty_hangup(struct work_st
tty_fasync(-1, filp, 0); /* can't block */
filp->f_op = &hung_up_tty_fops;
}
- file_list_unlock();
+ spin_unlock(&tty_files_lock);
tty_ldisc_hangup(tty);
@@ -1419,9 +1422,9 @@ static void release_one_tty(struct work_
tty_driver_kref_put(driver);
module_put(driver->owner);
- file_list_lock();
+ spin_lock(&tty_files_lock);
list_del_init(&tty->tty_files);
- file_list_unlock();
+ spin_unlock(&tty_files_lock);
free_tty_struct(tty);
}
@@ -1664,7 +1667,10 @@ int tty_release(struct inode *inode, str
* - do_tty_hangup no longer sees this file descriptor as
* something that needs to be handled for hangups.
*/
- file_kill(filp);
+ spin_lock(&tty_files_lock);
+ BUG_ON(list_empty(&filp->f_u.fu_list));
+ list_del_init(&filp->f_u.fu_list);
+ spin_unlock(&tty_files_lock);
filp->private_data = NULL;
/*
@@ -1833,7 +1839,11 @@ got_driver:
}
filp->private_data = tty;
- file_move(filp, &tty->tty_files);
+ BUG_ON(list_empty(&filp->f_u.fu_list));
+ file_sb_list_del(filp); /* __dentry_open has put it on the sb list */
+ spin_lock(&tty_files_lock);
+ list_add(&filp->f_u.fu_list, &tty->tty_files);
+ spin_unlock(&tty_files_lock);
check_tty_count(tty, "tty_open");
if (tty->driver->type == TTY_DRIVER_TYPE_PTY &&
tty->driver->subtype == PTY_TYPE_MASTER)
Index: linux-2.6/fs/file_table.c
===================================================================
--- linux-2.6.orig/fs/file_table.c
+++ linux-2.6/fs/file_table.c
@@ -32,8 +32,7 @@ struct files_stat_struct files_stat = {
.max_files = NR_FILE
};
-/* public. Not pretty! */
-__cacheline_aligned_in_smp DEFINE_SPINLOCK(files_lock);
+static __cacheline_aligned_in_smp DEFINE_SPINLOCK(files_lock);
/* SLAB cache for file structures */
static struct kmem_cache *filp_cachep __read_mostly;
@@ -258,7 +257,7 @@ void __fput(struct file *file)
cdev_put(inode->i_cdev);
fops_put(file->f_op);
put_pid(file->f_owner.pid);
- file_kill(file);
+ file_sb_list_del(file);
if (file->f_mode & FMODE_WRITE)
drop_file_write_access(file);
file->f_path.dentry = NULL;
@@ -320,31 +319,29 @@ struct file *fget_light(unsigned int fd,
return file;
}
-
void put_filp(struct file *file)
{
if (atomic_long_dec_and_test(&file->f_count)) {
security_file_free(file);
- file_kill(file);
+ file_sb_list_del(file);
file_free(file);
}
}
-void file_move(struct file *file, struct list_head *list)
+void file_sb_list_add(struct file *file, struct super_block *sb)
{
- if (!list)
- return;
- file_list_lock();
- list_move(&file->f_u.fu_list, list);
- file_list_unlock();
+ spin_lock(&files_lock);
+ BUG_ON(!list_empty(&file->f_u.fu_list));
+ list_add(&file->f_u.fu_list, &sb->s_files);
+ spin_unlock(&files_lock);
}
-void file_kill(struct file *file)
+void file_sb_list_del(struct file *file)
{
if (!list_empty(&file->f_u.fu_list)) {
- file_list_lock();
+ spin_lock(&files_lock);
list_del_init(&file->f_u.fu_list);
- file_list_unlock();
+ spin_unlock(&files_lock);
}
}
@@ -353,7 +350,7 @@ int fs_may_remount_ro(struct super_block
struct file *file;
/* Check that no files are currently opened for writing. */
- file_list_lock();
+ spin_lock(&files_lock);
list_for_each_entry(file, &sb->s_files, f_u.fu_list) {
struct inode *inode = file->f_path.dentry->d_inode;
@@ -365,10 +362,10 @@ int fs_may_remount_ro(struct super_block
if (S_ISREG(inode->i_mode) && (file->f_mode & FMODE_WRITE))
goto too_bad;
}
- file_list_unlock();
+ spin_unlock(&files_lock);
return 1; /* Tis' cool bro. */
too_bad:
- file_list_unlock();
+ spin_unlock(&files_lock);
return 0;
}
@@ -384,7 +381,7 @@ void mark_files_ro(struct super_block *s
struct file *f;
retry:
- file_list_lock();
+ spin_lock(&files_lock);
list_for_each_entry(f, &sb->s_files, f_u.fu_list) {
struct vfsmount *mnt;
if (!S_ISREG(f->f_path.dentry->d_inode->i_mode))
@@ -400,16 +397,13 @@ retry:
continue;
file_release_write(f);
mnt = mntget(f->f_path.mnt);
- file_list_unlock();
- /*
- * This can sleep, so we can't hold
- * the file_list_lock() spinlock.
- */
+ /* This can sleep, so we can't hold the spinlock. */
+ spin_unlock(&files_lock);
mnt_drop_write(mnt);
mntput(mnt);
goto retry;
}
- file_list_unlock();
+ spin_unlock(&files_lock);
}
void __init files_init(unsigned long mempages)
Index: linux-2.6/fs/open.c
===================================================================
--- linux-2.6.orig/fs/open.c
+++ linux-2.6/fs/open.c
@@ -841,7 +841,7 @@ static struct file *__dentry_open(struct
f->f_path.mnt = mnt;
f->f_pos = 0;
f->f_op = fops_get(inode->i_fop);
- file_move(f, &inode->i_sb->s_files);
+ file_sb_list_add(f, inode->i_sb);
error = security_dentry_open(f, cred);
if (error)
@@ -887,7 +887,7 @@ cleanup_all:
mnt_drop_write(mnt);
}
}
- file_kill(f);
+ file_sb_list_del(f);
f->f_path.dentry = NULL;
f->f_path.mnt = NULL;
cleanup_file:
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h
+++ linux-2.6/include/linux/fs.h
@@ -948,9 +948,6 @@ struct file {
unsigned long f_mnt_write_state;
#endif
};
-extern spinlock_t files_lock;
-#define file_list_lock() spin_lock(&files_lock);
-#define file_list_unlock() spin_unlock(&files_lock);
#define get_file(x) atomic_long_inc(&(x)->f_count)
#define file_count(x) atomic_long_read(&(x)->f_count)
@@ -2181,8 +2178,8 @@ static inline void insert_inode_hash(str
__insert_inode_hash(inode, inode->i_ino);
}
-extern void file_move(struct file *f, struct list_head *list);
-extern void file_kill(struct file *f);
+extern void file_sb_list_add(struct file *f, struct super_block *sb);
+extern void file_sb_list_del(struct file *f);
#ifdef CONFIG_BLOCK
struct bio;
extern void submit_bio(int, struct bio *);
Index: linux-2.6/security/selinux/hooks.c
===================================================================
--- linux-2.6.orig/security/selinux/hooks.c
+++ linux-2.6/security/selinux/hooks.c
@@ -2241,7 +2241,7 @@ static inline void flush_unauthorized_fi
tty = get_current_tty();
if (tty) {
- file_list_lock();
+ spin_lock(&tty_files_lock);
if (!list_empty(&tty->tty_files)) {
struct inode *inode;
@@ -2257,7 +2257,7 @@ static inline void flush_unauthorized_fi
drop_tty = 1;
}
}
- file_list_unlock();
+ spin_unlock(&tty_files_lock);
tty_kref_put(tty);
}
/* Reset controlling tty. */
Index: linux-2.6/include/linux/tty.h
===================================================================
--- linux-2.6.orig/include/linux/tty.h
+++ linux-2.6/include/linux/tty.h
@@ -464,6 +464,7 @@ extern struct tty_struct *tty_pair_get_t
extern struct tty_struct *tty_pair_get_pty(struct tty_struct *tty);
extern struct mutex tty_mutex;
+extern spinlock_t tty_files_lock;
extern void tty_write_unlock(struct tty_struct *tty);
extern int tty_write_lock(struct tty_struct *tty, int ndelay);
^ permalink raw reply [flat|nested] 5+ messages in thread
* [patch 2/2] fs: scale files_lock
2010-03-16 9:44 [patch 1/2] fs: cleanup files_lock Nick Piggin
@ 2010-03-16 9:46 ` Nick Piggin
2010-03-16 14:41 ` [patch 1/2] fs: cleanup files_lock Andi Kleen
2010-03-17 14:16 ` Greg KH
2 siblings, 0 replies; 5+ messages in thread
From: Nick Piggin @ 2010-03-16 9:46 UTC (permalink / raw)
To: Al Viro, Frank Mayhar, John Stultz, Andi Kleen, linux-fsdevel
Cc: Greg Kroah-Hartman, Alan Cox, Eric W. Biederman, Linus Torvalds
fs: scale files_lock
Improve scalability of files_lock by adding per-cpu, per-sb files lists,
protected with per-cpu locking, effectively turning it into a big-writer lock.
One difficulty with this approach is that a file can be removed from the list
by another CPU, so we must track which per-cpu list the file is on. Scalability
could suffer if files are frequently removed from a different CPU's list.
However, loads with frequent removal of files imply a short interval between
adding and removing the files, and the scheduler attempts to avoid moving
processes too far away. Also, even in the case of cross-CPU removal, the
hardware has much more opportunity to parallelise cacheline transfers with N
cachelines than with 1.
A worst case of 1 CPU allocating files that are subsequently freed by N CPUs
degenerates to contending on a single lock, which is no worse than before. When
more than one CPU is allocating files, even if they are always freed by
different CPUs, there will be more parallelism than in the single-lock case.
Testing on a 2-socket, 8-core Opteron, I measured the number of times the
lock is taken to remove a file, the number of times it is removed by the
same CPU that added it, and the number of times it is removed by the same
node that added it.
Booting:
locks=25049 cpu-hits=23174 (92.5%) node-hits=23945 (95.6%)
kbuild -j16
locks=2281913 cpu-hits=2208126 (96.8%) node-hits=2252674 (98.7%)
dbench 64
locks=4306582 cpu-hits=4287247 (99.6%) node-hits=4299527 (99.8%)
Signed-off-by: Nick Piggin <npiggin@suse.de>
---
fs/file_table.c | 155 ++++++++++++++++++++++++++++++++++++++---------------
fs/super.c | 18 ++++++
include/linux/fs.h | 7 ++
3 files changed, 139 insertions(+), 41 deletions(-)
Index: linux-2.6/fs/file_table.c
===================================================================
--- linux-2.6.orig/fs/file_table.c
+++ linux-2.6/fs/file_table.c
@@ -21,6 +21,7 @@
#include <linux/fsnotify.h>
#include <linux/sysctl.h>
#include <linux/percpu_counter.h>
+#include <linux/percpu.h>
#include <linux/ima.h>
#include <asm/atomic.h>
@@ -32,7 +33,7 @@ struct files_stat_struct files_stat = {
.max_files = NR_FILE
};
-static __cacheline_aligned_in_smp DEFINE_SPINLOCK(files_lock);
+static DEFINE_PER_CPU(spinlock_t, files_cpulock);
/* SLAB cache for file structures */
static struct kmem_cache *filp_cachep __read_mostly;
@@ -330,42 +331,101 @@ void put_filp(struct file *file)
void file_sb_list_add(struct file *file, struct super_block *sb)
{
- spin_lock(&files_lock);
+ spinlock_t *lock;
+ struct list_head *list;
+#ifdef CONFIG_SMP
+ int cpu;
+#endif
+
+ lock = &get_cpu_var(files_cpulock);
+#ifdef CONFIG_SMP
+ cpu = smp_processor_id();
+ list = per_cpu_ptr(sb->s_files, cpu);
+ file->f_sb_list_cpu = cpu;
+#else
+ list = &sb->s_files;
+#endif
+ spin_lock(lock);
BUG_ON(!list_empty(&file->f_u.fu_list));
- list_add(&file->f_u.fu_list, &sb->s_files);
- spin_unlock(&files_lock);
+ list_add(&file->f_u.fu_list, list);
+ spin_unlock(lock);
+ put_cpu_var(files_cpulock);
}
void file_sb_list_del(struct file *file)
{
if (!list_empty(&file->f_u.fu_list)) {
- spin_lock(&files_lock);
+ spinlock_t *lock;
+
+#ifdef CONFIG_SMP
+ lock = &per_cpu(files_cpulock, file->f_sb_list_cpu);
+#else
+ lock = &__get_cpu_var(files_cpulock);
+#endif
+ spin_lock(lock);
list_del_init(&file->f_u.fu_list);
- spin_unlock(&files_lock);
+ spin_unlock(lock);
+ }
+}
+
+static void file_list_lock_all(void)
+{
+ int i;
+ int nr = 0;
+
+ for_each_possible_cpu(i) {
+ spinlock_t *lock;
+
+ lock = &per_cpu(files_cpulock, i);
+ spin_lock_nested(lock, nr);
+ nr++;
+ }
+}
+
+static void file_list_unlock_all(void)
+{
+ int i;
+
+ for_each_possible_cpu(i) {
+ spinlock_t *lock;
+
+ lock = &per_cpu(files_cpulock, i);
+ spin_unlock(lock);
}
}
int fs_may_remount_ro(struct super_block *sb)
{
- struct file *file;
+ int i;
/* Check that no files are currently opened for writing. */
- spin_lock(&files_lock);
- list_for_each_entry(file, &sb->s_files, f_u.fu_list) {
- struct inode *inode = file->f_path.dentry->d_inode;
-
- /* File with pending delete? */
- if (inode->i_nlink == 0)
- goto too_bad;
-
- /* Writeable file? */
- if (S_ISREG(inode->i_mode) && (file->f_mode & FMODE_WRITE))
- goto too_bad;
+ file_list_lock_all();
+ for_each_possible_cpu(i) {
+ struct file *file;
+ struct list_head *list;
+
+#ifdef CONFIG_SMP
+ list = per_cpu_ptr(sb->s_files, i);
+#else
+ list = &sb->s_files;
+#endif
+ list_for_each_entry(file, list, f_u.fu_list) {
+ struct inode *inode = file->f_path.dentry->d_inode;
+
+ /* File with pending delete? */
+ if (inode->i_nlink == 0)
+ goto too_bad;
+
+ /* Writeable file? */
+ if (S_ISREG(inode->i_mode) &&
+ (file->f_mode & FMODE_WRITE))
+ goto too_bad;
+ }
}
- spin_unlock(&files_lock);
+ file_list_unlock_all();
return 1; /* Tis' cool bro. */
too_bad:
- spin_unlock(&files_lock);
+ file_list_unlock_all();
return 0;
}
@@ -378,37 +438,48 @@ too_bad:
*/
void mark_files_ro(struct super_block *sb)
{
- struct file *f;
+ int i;
retry:
- spin_lock(&files_lock);
- list_for_each_entry(f, &sb->s_files, f_u.fu_list) {
- struct vfsmount *mnt;
- if (!S_ISREG(f->f_path.dentry->d_inode->i_mode))
- continue;
- if (!file_count(f))
- continue;
- if (!(f->f_mode & FMODE_WRITE))
- continue;
- spin_lock(&f->f_lock);
- f->f_mode &= ~FMODE_WRITE;
- spin_unlock(&f->f_lock);
- if (file_check_writeable(f) != 0)
- continue;
- file_release_write(f);
- mnt = mntget(f->f_path.mnt);
- /* This can sleep, so we can't hold the spinlock. */
- spin_unlock(&files_lock);
- mnt_drop_write(mnt);
- mntput(mnt);
- goto retry;
+ file_list_lock_all();
+ for_each_possible_cpu(i) {
+ struct file *f;
+ struct list_head *list;
+
+#ifdef CONFIG_SMP
+ list = per_cpu_ptr(sb->s_files, i);
+#else
+ list = &sb->s_files;
+#endif
+ list_for_each_entry(f, list, f_u.fu_list) {
+ struct vfsmount *mnt;
+ if (!S_ISREG(f->f_path.dentry->d_inode->i_mode))
+ continue;
+ if (!file_count(f))
+ continue;
+ if (!(f->f_mode & FMODE_WRITE))
+ continue;
+ spin_lock(&f->f_lock);
+ f->f_mode &= ~FMODE_WRITE;
+ spin_unlock(&f->f_lock);
+ if (file_check_writeable(f) != 0)
+ continue;
+ file_release_write(f);
+ mnt = mntget(f->f_path.mnt);
+ /* This can sleep, so we can't hold the spinlock. */
+ file_list_unlock_all();
+ mnt_drop_write(mnt);
+ mntput(mnt);
+ goto retry;
+ }
}
- spin_unlock(&files_lock);
+ file_list_unlock_all();
}
void __init files_init(unsigned long mempages)
{
int n;
+ int i;
filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
@@ -423,5 +494,7 @@ void __init files_init(unsigned long mem
if (files_stat.max_files < NR_FILE)
files_stat.max_files = NR_FILE;
files_defer_init();
+ for_each_possible_cpu(i)
+ spin_lock_init(&per_cpu(files_cpulock, i));
percpu_counter_init(&nr_files, 0);
}
Index: linux-2.6/fs/super.c
===================================================================
--- linux-2.6.orig/fs/super.c
+++ linux-2.6/fs/super.c
@@ -62,7 +62,22 @@ static struct super_block *alloc_super(s
s = NULL;
goto out;
}
+#ifdef CONFIG_SMP
+ s->s_files = alloc_percpu(struct list_head);
+ if (!s->s_files) {
+ security_sb_free(s);
+ kfree(s);
+ s = NULL;
+ goto out;
+ } else {
+ int i;
+
+ for_each_possible_cpu(i)
+ INIT_LIST_HEAD(per_cpu_ptr(s->s_files, i));
+ }
+#else
INIT_LIST_HEAD(&s->s_files);
+#endif
INIT_LIST_HEAD(&s->s_instances);
INIT_HLIST_HEAD(&s->s_anon);
INIT_LIST_HEAD(&s->s_inodes);
@@ -117,6 +132,9 @@ out:
*/
static inline void destroy_super(struct super_block *s)
{
+#ifdef CONFIG_SMP
+ free_percpu(s->s_files);
+#endif
security_sb_free(s);
kfree(s->s_subtype);
kfree(s->s_options);
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h
+++ linux-2.6/include/linux/fs.h
@@ -924,6 +924,9 @@ struct file {
#define f_vfsmnt f_path.mnt
const struct file_operations *f_op;
spinlock_t f_lock; /* f_ep_links, f_flags, no IRQ */
+#ifdef CONFIG_SMP
+ int f_sb_list_cpu;
+#endif
atomic_long_t f_count;
unsigned int f_flags;
fmode_t f_mode;
@@ -1340,7 +1343,11 @@ struct super_block {
struct list_head s_inodes; /* all inodes */
struct hlist_head s_anon; /* anonymous dentries for (nfs) exporting */
+#ifdef CONFIG_SMP
+ struct list_head *s_files;
+#else
struct list_head s_files;
+#endif
/* s_dentry_lru and s_nr_dentry_unused are protected by dcache_lock */
struct list_head s_dentry_lru; /* unused dentry lru */
int s_nr_dentry_unused; /* # of dentry on lru */
* Re: [patch 1/2] fs: cleanup files_lock
2010-03-16 9:44 [patch 1/2] fs: cleanup files_lock Nick Piggin
2010-03-16 9:46 ` [patch 2/2] fs: scale files_lock Nick Piggin
@ 2010-03-16 14:41 ` Andi Kleen
2010-03-17 14:16 ` Greg KH
2 siblings, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2010-03-16 14:41 UTC (permalink / raw)
To: Nick Piggin
Cc: Al Viro, Frank Mayhar, John Stultz, linux-fsdevel,
Greg Kroah-Hartman, Alan Cox, Eric W. Biederman, Linus Torvalds
Nick Piggin wrote:
> I would like to start sending bits of vfs scalability improvements to
> be reviewed and hopefully merged.
>
> I will start with files_lock. Last time this one came up, it was
> criticised because some hoped to get rid of files list, and because
> it didn't have enough justification of the scalability improvement.
>
> For the first criticism, it isn't any more difficult to rip out if
> we are ever able to remove files list. For the second, I have gathered
> some statistics and written better justification. Andi I believe is
> finding kbuild is becoming limited by files lock on larger systems.
Yes, we're seeing the files_list lock as the major bottleneck in kernel
builds on a machine with 32 CPU threads.
I did a quick review of the patch and it looks good to me.
Acked-by: Andi Kleen <ak@linux.intel.com>
-Andi
* Re: [patch 1/2] fs: cleanup files_lock
2010-03-16 9:44 [patch 1/2] fs: cleanup files_lock Nick Piggin
2010-03-16 9:46 ` [patch 2/2] fs: scale files_lock Nick Piggin
2010-03-16 14:41 ` [patch 1/2] fs: cleanup files_lock Andi Kleen
@ 2010-03-17 14:16 ` Greg KH
2010-03-17 14:38 ` Nick Piggin
2 siblings, 1 reply; 5+ messages in thread
From: Greg KH @ 2010-03-17 14:16 UTC (permalink / raw)
To: Nick Piggin
Cc: Al Viro, Frank Mayhar, John Stultz, Andi Kleen, linux-fsdevel,
Alan Cox, Eric W. Biederman, Linus Torvalds
On Tue, Mar 16, 2010 at 08:44:23PM +1100, Nick Piggin wrote:
> I would like to start sending bits of vfs scalability improvements to
> be reviewed and hopefully merged.
>
> I will start with files_lock. Last time this one came up, it was
> criticised because some hoped to get rid of files list, and because
> it didn't have enough justification of the scalability improvement.
>
> For the first criticism, it isn't any more difficult to rip out if
> we are ever able to remove files list. For the second, I have gathered
> some statistics and written better justification. Andi I believe is
> finding kbuild is becoming limited by files lock on larger systems.
>
> --
>
> fs: cleanup files_lock
>
> Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
> manipulate the per-sb files list; unexport the files_lock spinlock.
>
> Signed-off-by: Nick Piggin <npiggin@suse.de>
Looks good to me. Do you want me to apply this to my tty tree, or do
you want to take it through your tree because your other patch depends
on this?
If you want to do it, feel free to add:
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
to the patch.
thanks,
greg k-h
* Re: [patch 1/2] fs: cleanup files_lock
2010-03-17 14:16 ` Greg KH
@ 2010-03-17 14:38 ` Nick Piggin
0 siblings, 0 replies; 5+ messages in thread
From: Nick Piggin @ 2010-03-17 14:38 UTC (permalink / raw)
To: Greg KH
Cc: Al Viro, Frank Mayhar, John Stultz, Andi Kleen, linux-fsdevel,
Alan Cox, Eric W. Biederman, Linus Torvalds
On Wed, Mar 17, 2010 at 07:16:02AM -0700, Greg KH wrote:
> On Tue, Mar 16, 2010 at 08:44:23PM +1100, Nick Piggin wrote:
> > I would like to start sending bits of vfs scalability improvements to
> > be reviewed and hopefully merged.
> >
> > I will start with files_lock. Last time this one came up, it was
> > criticised because some hoped to get rid of files list, and because
> > it didn't have enough justification of the scalability improvement.
> >
> > For the first criticism, it isn't any more difficult to rip out if
> > we are ever able to remove files list. For the second, I have gathered
> > some statistics and written better justification. Andi I believe is
> > finding kbuild is becoming limited by files lock on larger systems.
> >
> > --
> >
> > fs: cleanup files_lock
> >
> > Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
> > manipulate the per-sb files list; unexport the files_lock spinlock.
> >
> > Signed-off-by: Nick Piggin <npiggin@suse.de>
>
> Looks good to me.
So long as the tty file always gets removed before file_sb_list_del
in __fput or put_filp. Otherwise file_sb_list_del would use the wrong
lock for it. As far as I could see, it should always get removed by
tty_release.
> Do you want me to apply this to my tty tree, or do
> you want to take it through your tree because your other patch depends
> on this?
>
> If you want to do it, feel free to add:
> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
> to the patch.
If they aren't both applied to -vfs tree in this round I'll ask you to
apply this one.
Thanks,
Nick