From: Anton Blanchard <anton@samba.org>
To: npiggin@kernel.dk, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: vfsmount lock issues on very large ppc64 box
Date: Sun, 17 Jul 2011 10:50:27 +1000
Message-ID: <20110717105027.53cc3ca4@kryten>
When compiling a kernel with make -j on a ppc64 box with 896
HW threads we spend a very large amount of time in vfsmount
lock code:

    20.85%  [k] .vfsmount_lock_local_lock
            |
            |--57.20%-- .mntput_no_expire
            |          |
            |          |--74.93%-- .fput
            |          |          |
            |          |          |--91.19%-- .filp_close
            |          |          |          |
            |          |          |          |--98.97%-- .sys_close

    14.15%  [k] .vfsmount_lock_global_lock_online
            |
            |--100.00%-- .mntput_no_expire
            |          .fput
            |          .filp_close
            |          |
            |          |--70.01%-- .put_files_struct
            |          |          .do_exit
            |          |          .do_group_exit
            |          |          .sys_exit_group
            |          |          syscall_exit
            |          |
            |           --29.99%-- .sys_close

Looking closer, all of these calls are in pipefs and sockfs.
Since we never mount either filesystem, they never get a long term
reference, and we always end up in the very slow write brlock path
that takes a lock for each online CPU.
Here is a quick hack that takes a long term reference on pipefs
and sockfs which fixes the problem. Any thoughts on how we should
fix it properly?
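
To make the cost above concrete, here is a rough user-space model of the
two paths. It is only a sketch under stated assumptions, not the kernel
code: toy_brlock, toy_mount, toy_mntput and NR_CPUS_ONLINE are
hypothetical names, and pthread mutexes stand in for the per-CPU locks.
It counts how many locks one reference drop has to acquire with and
without a long term reference.

```c
#include <assert.h>
#include <pthread.h>

#define NR_CPUS_ONLINE 8	/* hypothetical; the box in this report has 896 */

/* Toy brlock: one mutex per online CPU. */
struct toy_brlock {
	pthread_mutex_t cpu_lock[NR_CPUS_ONLINE];
};

/* Toy mount: only the fields this model needs. */
struct toy_mount {
	int count;	/* reference count */
	int longterm;	/* non-zero once a long term reference is held */
};

static void toy_br_init(struct toy_brlock *l)
{
	for (int i = 0; i < NR_CPUS_ONLINE; i++)
		pthread_mutex_init(&l->cpu_lock[i], NULL);
}

/* Read side: take only the calling CPU's lock. */
static void toy_br_read_lock(struct toy_brlock *l, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_ONLINE);
	pthread_mutex_lock(&l->cpu_lock[cpu]);
}

static void toy_br_read_unlock(struct toy_brlock *l, int cpu)
{
	pthread_mutex_unlock(&l->cpu_lock[cpu]);
}

/* Write side: take every CPU's lock, in index order. */
static void toy_br_write_lock(struct toy_brlock *l)
{
	for (int i = 0; i < NR_CPUS_ONLINE; i++)
		pthread_mutex_lock(&l->cpu_lock[i]);
}

static void toy_br_write_unlock(struct toy_brlock *l)
{
	for (int i = 0; i < NR_CPUS_ONLINE; i++)
		pthread_mutex_unlock(&l->cpu_lock[i]);
}

/*
 * Drop one reference and return the number of locks acquired.
 * With a long term reference the read lock is enough; without one
 * the model falls through to the write side, as described above.
 */
static int toy_mntput(struct toy_brlock *l, struct toy_mount *m, int cpu)
{
	int locks_taken = 0;

	toy_br_read_lock(l, cpu);
	locks_taken++;
	if (m->longterm) {
		m->count--;
		toy_br_read_unlock(l, cpu);
		return locks_taken;
	}
	toy_br_read_unlock(l, cpu);

	toy_br_write_lock(l);
	locks_taken += NR_CPUS_ONLINE;
	m->count--;
	toy_br_write_unlock(l);
	return locks_taken;
}
```

In this model a drop without a long term reference costs
1 + NR_CPUS_ONLINE lock acquisitions, against a single one once the
long term flag is set. With NR_CPUS_ONLINE raised to 896 that would be
897 acquisitions per drop.
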
---
Signed-off-by: Anton Blanchard <anton@samba.org>

Index: linux-2.6-work/fs/pipe.c
===================================================================
--- linux-2.6-work.orig/fs/pipe.c	2011-07-17 10:21:54.695472158 +1000
+++ linux-2.6-work/fs/pipe.c	2011-07-17 10:33:31.127204731 +1000
@@ -20,6 +20,7 @@
 #include <linux/audit.h>
 #include <linux/syscalls.h>
 #include <linux/fcntl.h>
+#include "internal.h"
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -1286,11 +1287,13 @@ static int __init init_pipe_fs(void)
 			unregister_filesystem(&pipe_fs_type);
 		}
 	}
+	mnt_make_longterm(pipe_mnt);
 	return err;
 }
 
 static void __exit exit_pipe_fs(void)
 {
+	mnt_make_shortterm(pipe_mnt);
 	unregister_filesystem(&pipe_fs_type);
 	mntput(pipe_mnt);
 }
Index: linux-2.6-work/net/socket.c
===================================================================
--- linux-2.6-work.orig/net/socket.c	2011-07-17 10:21:54.685471989 +1000
+++ linux-2.6-work/net/socket.c	2011-07-17 10:33:41.247375257 +1000
@@ -2500,6 +2500,8 @@ void sock_unregister(int family)
 }
 EXPORT_SYMBOL(sock_unregister);
 
+extern void mnt_make_longterm(struct vfsmount *);
+
 static int __init sock_init(void)
 {
 	int err;
@@ -2530,6 +2532,8 @@ static int __init sock_init(void)
 		goto out_mount;
 	}
 
+	mnt_make_longterm(sock_mnt);
+
 	/* The real protocol initialization is performed in later initcalls.
 	 */