From: Anton Blanchard <anton@samba.org>
To: npiggin@kernel.dk, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: vfsmount lock issues on very large ppc64 box
Date: Sun, 17 Jul 2011 10:50:27 +1000
Message-ID: <20110717105027.53cc3ca4@kryten>
When compiling a kernel with make -j on a ppc64 box with 896
HW threads, we spend a very large amount of time in the vfsmount
lock code:

    20.85%  [k] .vfsmount_lock_local_lock
            |
            |--57.20%-- .mntput_no_expire
            |          |
            |          |--74.93%-- .fput
            |          |          |
            |          |          |--91.19%-- .filp_close
            |          |          |          |
            |          |          |          |--98.97%-- .sys_close

    14.15%  [k] .vfsmount_lock_global_lock_online
            |
            |--100.00%-- .mntput_no_expire
            |          .fput
            |          .filp_close
            |          |
            |          |--70.01%-- .put_files_struct
            |          |          .do_exit
            |          |          .do_group_exit
            |          |          .sys_exit_group
            |          |          syscall_exit
            |          |
            |           --29.99%-- .sys_close

Looking closer, all of these calls are in pipefs and sockfs.
Since we never mount either filesystem, they never get a long-term
reference, and we always end up in the very slow write brlock path,
which takes a lock for each online CPU.
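
To put a rough number on that cost, here is a toy userspace model of a
big-reader lock. It is only a sketch, not the kernel's brlock code: every
toy_* name is hypothetical, the "locks" are plain flags with no real
synchronization, and the only thing it measures is how many lock
acquisitions each side performs on an 896-thread machine.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 896	/* HW thread count quoted in the report */

/* Hypothetical per-CPU lock set: just flags, no real synchronization. */
struct toy_brlock {
	int held[TOY_NR_CPUS];
	unsigned long acquisitions;	/* lock operations performed so far */
};

static void toy_acquire(struct toy_brlock *b, size_t cpu)
{
	assert(!b->held[cpu]);
	b->held[cpu] = 1;
	b->acquisitions++;
}

static void toy_release(struct toy_brlock *b, size_t cpu)
{
	assert(b->held[cpu]);
	b->held[cpu] = 0;
}

/* Read side: only the calling CPU's lock is touched. */
static void toy_br_read_lock(struct toy_brlock *b, size_t cpu)
{
	toy_acquire(b, cpu);
}

static void toy_br_read_unlock(struct toy_brlock *b, size_t cpu)
{
	toy_release(b, cpu);
}

/* Write side: every CPU's lock is taken, in index order. */
static void toy_br_write_lock(struct toy_brlock *b)
{
	for (size_t i = 0; i < TOY_NR_CPUS; i++)
		toy_acquire(b, i);
}

static void toy_br_write_unlock(struct toy_brlock *b)
{
	for (size_t i = 0; i < TOY_NR_CPUS; i++)
		toy_release(b, i);
}

/* Lock acquisitions needed for n critical sections of the given kind. */
static unsigned long toy_cost(int write, unsigned int n)
{
	static struct toy_brlock b;	/* zero-initialized, all locks free */

	b.acquisitions = 0;
	for (unsigned int i = 0; i < n; i++) {
		if (write) {
			toy_br_write_lock(&b);
			toy_br_write_unlock(&b);
		} else {
			toy_br_read_lock(&b, 0);
			toy_br_read_unlock(&b, 0);
		}
	}
	return b.acquisitions;
}
```

In this model ten read-side sections cost 10 acquisitions, while ten
write-side sections cost 8960, which is why hitting the write path on
every close() of a pipe or socket hurts so much at this CPU count.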
Here is a quick hack that takes a long-term reference on pipefs
and sockfs, which fixes the problem. Any thoughts on how we should
fix it properly?
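
The reason a long-term reference helps can be sketched as a decision in
the put path. The following is a simplified userspace model, not the
kernel's mntput_no_expire(): the toy_* names and the two-field struct are
hypothetical, and it assumes that a mount holding a long-term reference
cannot be dropping its last reference, so the cheap read-side path is
enough.

```c
#include <assert.h>

/* Hypothetical, stripped-down stand-in for a mount's reference state. */
struct toy_mount {
	int count;	/* ordinary references */
	int longterm;	/* long-term references, e.g. a real mount point */
};

enum toy_path { TOY_FAST_READ, TOY_SLOW_WRITE };

static void toy_make_longterm(struct toy_mount *m)
{
	m->longterm++;
}

static void toy_make_shortterm(struct toy_mount *m)
{
	assert(m->longterm > 0);
	m->longterm--;
}

/* Drop one reference and report which locking path the drop needed. */
static enum toy_path toy_mntput(struct toy_mount *m)
{
	assert(m->count > 0);

	if (m->longterm > 0) {
		/* Pinned elsewhere: this cannot be the final put. */
		m->count--;
		return TOY_FAST_READ;
	}

	/* Possibly the final put: everyone else must be excluded. */
	m->count--;
	return TOY_SLOW_WRITE;
}
```

With longterm left at zero, as for the internal pipefs and sockfs mounts
described above, every toy_mntput() reports TOY_SLOW_WRITE; after one
toy_make_longterm() call the same put reports TOY_FAST_READ.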
---
Signed-off-by: Anton Blanchard <anton@samba.org>
Index: linux-2.6-work/fs/pipe.c
===================================================================
--- linux-2.6-work.orig/fs/pipe.c 2011-07-17 10:21:54.695472158 +1000
+++ linux-2.6-work/fs/pipe.c 2011-07-17 10:33:31.127204731 +1000
@@ -20,6 +20,7 @@
 #include <linux/audit.h>
 #include <linux/syscalls.h>
 #include <linux/fcntl.h>
+#include "internal.h"

 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -1286,11 +1287,13 @@ static int __init init_pipe_fs(void)
 			unregister_filesystem(&pipe_fs_type);
 		}
 	}
+	mnt_make_longterm(pipe_mnt);
 	return err;
 }

 static void __exit exit_pipe_fs(void)
 {
+	mnt_make_shortterm(pipe_mnt);
 	unregister_filesystem(&pipe_fs_type);
 	mntput(pipe_mnt);
 }
Index: linux-2.6-work/net/socket.c
===================================================================
--- linux-2.6-work.orig/net/socket.c 2011-07-17 10:21:54.685471989 +1000
+++ linux-2.6-work/net/socket.c 2011-07-17 10:33:41.247375257 +1000
@@ -2500,6 +2500,8 @@ void sock_unregister(int family)
 }
 EXPORT_SYMBOL(sock_unregister);

+extern void mnt_make_longterm(struct vfsmount *);
+
 static int __init sock_init(void)
 {
 	int err;
@@ -2530,6 +2532,8 @@ static int __init sock_init(void)
 		goto out_mount;
 	}

+	mnt_make_longterm(sock_mnt);
+
 	/* The real protocol initialization is performed in later initcalls.
 	 */
