linux-fsdevel.vger.kernel.org archive mirror
* Re: scalability investigation: Where can I get your latest patches?
       [not found] ` <20100720031201.GC21274@amd>
@ 2010-08-04  1:04   ` Zhang, Yanmin
  2010-08-04  7:21     ` Kleen, Andi
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Zhang, Yanmin @ 2010-08-04  1:04 UTC (permalink / raw)
  To: Nick Piggin; +Cc: andi.kleen, alexs.shi, linux-mm, linux-fsdevel

On Tue, 2010-07-20 at 13:12 +1000, Nick Piggin wrote:
> On Thu, Jul 08, 2010 at 04:56:27PM +0800, Zhang, Yanmin wrote:
> > Nick,
> > 
> > I work with Andi Kleen and Tim to investigate some scalability issues.
> > 
> > Andi gave me a pointer at:
> > http://thread.gmane.org/gmane.linux.kernel/1002380/focus=42284
> > 
> > Where can I get your latest patches? It's better if I could get patch tarball.
> > 
> > Thanks,
> > Yanmin
> > 
> 
> Hi Yanmin,
> 
> Sorry for the delay. I have a git tree now, and it has been through
> some stress testing.
> 
> http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-npiggin.git
> 
> I would be very interested to know if you encounter problems or are
> able to generate any benchmark numbers.
Nick,

We ran lots of benchmarks on many machines. Below is a summary of the results to share with you.

Improvement:
1) We get about a 30% improvement with the kbuild workload on Nehalem machines. kbuild
performance is usually hard to improve, but your tree manages it.

Issues:
1) Compiling fails for a couple of file systems, for example with CONFIG_ISO9660_FS=y.
2) dbenchthreads has about a 50% regression. We connect a JBOD of 12 disks to
a machine and start 4 dbench threads per disk.
We run the workload under a regular user account; if we run it under the root account,
we get a 22% improvement instead of a regression.
The root cause is ACL checking. With your patch, do_path_lookup first goes through the
rcu-walk steps, which include an exec permission check. With ACLs, __exec_permission
always fails, and the later nameidata_drop_rcu often fails because dentry->d_seq has
changed (a simplified sketch of this path follows the issue list).

With the root account this doesn't happen. We mount the working devices under /mnt/stp/XXX,
and /mnt is owned by root, so the exec permission check passes.

If I remount all file systems on the testing path with the noacl option, I get results
similar to those under the root account.

3) aim7 has about a 40% regression on a 4-socket Nehalem EX machine. The root cause is
the same as in 2).
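
For reference, below is a simplified sketch of the exec-permission path described in 2).
This is not the code from Nick's tree; it is loosely modeled on the mainline 2.6.35
acl_permission_check(), with -ECHILD standing for "drop out of rcu-walk and retry":

#include <linux/cred.h>
#include <linux/errno.h>
#include <linux/fs.h>

/*
 * Simplified sketch of an rcu-walk exec permission check.  rcu-walk may not
 * sleep or take references, so if an ACL would have to be consulted (and
 * possibly read from disk), the check bails out and the lookup retries in
 * ref-walk mode -- the fallback we are seeing with dbench and aim7.
 */
static int exec_permission_rcu_sketch(struct inode *inode)
{
	umode_t mode = inode->i_mode;

	if (current_fsuid() == inode->i_uid) {
		/* Owner: the cached mode bits decide.  Running as root on a
		 * root-owned /mnt stays on this branch, which is why the
		 * regression disappears under the root account. */
		mode >>= 6;
	} else if (IS_POSIXACL(inode) && (mode & S_IRWXG) &&
		   inode->i_op->check_acl) {
		/* Not the owner and an ACL may apply: reading it could
		 * sleep, which rcu-walk forbids, so drop to ref-walk. */
		return -ECHILD;
	} else if (in_group_p(inode->i_gid)) {
		mode >>= 3;
	}

	return (mode & MAY_EXEC) ? 0 : -EACCES;
}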

The other benchmarks show neither improvement nor regression.

Yanmin




* RE: scalability investigation: Where can I get your latest patches?
  2010-08-04  1:04   ` scalability investigation: Where can I get your latest patches? Zhang, Yanmin
@ 2010-08-04  7:21     ` Kleen, Andi
  2010-08-04  7:58       ` Zhang, Yanmin
  2010-08-05 10:55     ` Nick Piggin
  2010-08-05 11:44     ` Nick Piggin
  2 siblings, 1 reply; 11+ messages in thread
From: Kleen, Andi @ 2010-08-04  7:21 UTC (permalink / raw)
  To: Zhang, Yanmin, Nick Piggin
  Cc: alexs.shi@intel.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org

> Issues:
> 1) Compiling fails on a couple of file systems, such like
> CONFIG_ISO9660_FS=y.
> 2) dbenchthreads has about 50% regression. We connect a JBOD of 12
> disks to
> a machine. Start 4 dbench threads per disk.
> We run the workload under a regular user account. If we run it under
> root account,
> we get 22% improvement instead of regression.
> The root cause is ACL checking. With your patch, do_path_lookup firstly
> goes through
> rcu steps which including a exec permission checking. With ACL, the
> __exec_permission
> always fails. Then a later nameidata_drop_rcu often fails as dentry-
> >d_seq is changed.

I believe the latest version of Nick's patchkit has a likely fix for that.

http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-npiggin.git;a=commitdiff;h=9edd35f9aeafc8a5e1688b84cf4488a94898ca45

-Andi


* RE: scalability investigation: Where can I get your latest patches?
  2010-08-04  7:21     ` Kleen, Andi
@ 2010-08-04  7:58       ` Zhang, Yanmin
  2010-08-04  8:06         ` Kleen, Andi
  0 siblings, 1 reply; 11+ messages in thread
From: Zhang, Yanmin @ 2010-08-04  7:58 UTC (permalink / raw)
  To: Kleen, Andi
  Cc: Nick Piggin, alexs.shi@intel.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org

On Wed, 2010-08-04 at 08:21 +0100, Kleen, Andi wrote:
> > Issues:
> > 1) Compiling fails on a couple of file systems, such like
> > CONFIG_ISO9660_FS=y.
> > 2) dbenchthreads has about 50% regression. We connect a JBOD of 12
> > disks to
> > a machine. Start 4 dbench threads per disk.
> > We run the workload under a regular user account. If we run it under
> > root account,
> > we get 22% improvement instead of regression.
> > The root cause is ACL checking. With your patch, do_path_lookup firstly
> > goes through
> > rcu steps which including a exec permission checking. With ACL, the
> > __exec_permission
> > always fails. Then a later nameidata_drop_rcu often fails as dentry-
> > >d_seq is changed.
> 
> I believe the latest version of Nick's patchkit has a likely fix for that.
> 
> http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-npiggin.git;a=commitdiff;h=9edd35f9aeafc8a5e1688b84cf4488a94898ca45

Thanks, Andi. The patch has no ext3 part, though.



* RE: scalability investigation: Where can I get your latest patches?
  2010-08-04  7:58       ` Zhang, Yanmin
@ 2010-08-04  8:06         ` Kleen, Andi
  2010-08-04  8:50           ` Zhang, Yanmin
  0 siblings, 1 reply; 11+ messages in thread
From: Kleen, Andi @ 2010-08-04  8:06 UTC (permalink / raw)
  To: Zhang, Yanmin
  Cc: Nick Piggin, Shi, Alex, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org

> > I believe the latest version of Nick's patchkit has a likely fix for
> that.
> >
> > http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-
> npiggin.git;a=commitdiff;h=9edd35f9aeafc8a5e1688b84cf4488a94898ca45
> 
> Thanks Andi. The patch has no ext3 part.

Good point. But perhaps the ext2 patch can be adapted; the ACL code
should be similar in ext2, ext3, and ext4.

-Andi



* RE: scalability investigation: Where can I get your latest patches?
  2010-08-04  8:06         ` Kleen, Andi
@ 2010-08-04  8:50           ` Zhang, Yanmin
  2010-08-05 10:57             ` Nick Piggin
  0 siblings, 1 reply; 11+ messages in thread
From: Zhang, Yanmin @ 2010-08-04  8:50 UTC (permalink / raw)
  To: Kleen, Andi
  Cc: Nick Piggin, Shi, Alex, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org

On Wed, 2010-08-04 at 09:06 +0100, Kleen, Andi wrote:
> > > I believe the latest version of Nick's patchkit has a likely fix for
> > that.
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-
> > npiggin.git;a=commitdiff;h=9edd35f9aeafc8a5e1688b84cf4488a94898ca45
> > 
> > Thanks Andi. The patch has no ext3 part.
> 
> Good point. But perhaps the ext2 patch can be adapted. The ACL code
> should be similar in ext2 and ext3 (and 4)
I ported the ext2 part to ext3. aim7 testing on a 4-socket Nehalem EX machine
shows the regression disappears.

---

diff -Nraup linux-2.6.35-rc5_nick/fs/ext3/acl.c linux-2.6.35-rc5_npymz/fs/ext3/acl.c
--- linux-2.6.35-rc5_nick/fs/ext3/acl.c	2010-08-05 16:23:19.000000000 +0800
+++ linux-2.6.35-rc5_npymz/fs/ext3/acl.c	2010-08-05 15:47:38.000000000 +0800
@@ -240,13 +240,21 @@ ext3_set_acl(handle_t *handle, struct in
 }
 
 int
-ext3_check_acl(struct inode *inode, int mask)
+ext3_check_acl_rcu(struct inode *inode, int mask, unsigned int flags)
 {
-	struct posix_acl *acl = ext3_get_acl(inode, ACL_TYPE_ACCESS);
+	struct posix_acl *acl;
 
-	if (IS_ERR(acl))
-		return PTR_ERR(acl);
-	if (acl) {
+	if (flags & IPERM_FLAG_RCU) {
+		if (!negative_cached_acl(inode, ACL_TYPE_ACCESS))
+			return -ECHILD;
+		return -EAGAIN;
+	}
+
+	acl = ext3_get_acl(inode, ACL_TYPE_ACCESS);
+	if (IS_ERR(acl))
+		return PTR_ERR(acl);
+
+	if (acl) {
 		int error = posix_acl_permission(inode, acl, mask);
 		posix_acl_release(acl);
 		return error;
diff -Nraup linux-2.6.35-rc5_nick/fs/ext3/acl.h linux-2.6.35-rc5_npymz/fs/ext3/acl.h
--- linux-2.6.35-rc5_nick/fs/ext3/acl.h	2010-08-05 16:23:19.000000000 +0800
+++ linux-2.6.35-rc5_npymz/fs/ext3/acl.h	2010-08-05 15:48:51.000000000 +0800
@@ -54,7 +54,7 @@ static inline int ext3_acl_count(size_t 
 #ifdef CONFIG_EXT3_FS_POSIX_ACL
 
 /* acl.c */
-extern int ext3_check_acl (struct inode *, int);
+extern int ext3_check_acl_rcu(struct inode *inode, int mask, unsigned int flags);
 extern int ext3_acl_chmod (struct inode *);
 extern int ext3_init_acl (handle_t *, struct inode *, struct inode *);
 
diff -Nraup linux-2.6.35-rc5_nick/fs/ext3/file.c linux-2.6.35-rc5_npymz/fs/ext3/file.c
--- linux-2.6.35-rc5_nick/fs/ext3/file.c	2010-08-05 16:23:19.000000000 +0800
+++ linux-2.6.35-rc5_npymz/fs/ext3/file.c	2010-08-05 15:52:39.000000000 +0800
@@ -79,7 +79,7 @@ const struct inode_operations ext3_file_
 	.listxattr	= ext3_listxattr,
 	.removexattr	= generic_removexattr,
 #endif
-	.check_acl	= ext3_check_acl,
+	.check_acl_rcu	= ext3_check_acl_rcu,
 	.fiemap		= ext3_fiemap,
 };
 
diff -Nraup linux-2.6.35-rc5_nick/fs/ext3/namei.c linux-2.6.35-rc5_npymz/fs/ext3/namei.c
--- linux-2.6.35-rc5_nick/fs/ext3/namei.c	2010-08-05 16:25:08.000000000 +0800
+++ linux-2.6.35-rc5_npymz/fs/ext3/namei.c	2010-08-05 16:01:47.000000000 +0800
@@ -2465,7 +2465,7 @@ const struct inode_operations ext3_dir_i
 	.listxattr	= ext3_listxattr,
 	.removexattr	= generic_removexattr,
 #endif
-	.check_acl	= ext3_check_acl,
+	.check_acl_rcu	= ext3_check_acl_rcu,
 };
 
 const struct inode_operations ext3_special_inode_operations = {
@@ -2476,5 +2476,5 @@ const struct inode_operations ext3_speci
 	.listxattr	= ext3_listxattr,
 	.removexattr	= generic_removexattr,
 #endif
-	.check_acl	= ext3_check_acl,
+	.check_acl_rcu	= ext3_check_acl_rcu,
 };
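
For reference, a rough sketch of how a caller could consume the new hook, assuming the
return-value convention implied by the patch above (this is not code from Nick's tree,
and the generic_permission() signature is the 2.6.35 one):

#include <linux/errno.h>
#include <linux/fs.h>

/*
 * Hypothetical caller-side sketch.  Assumed convention:
 *   -ECHILD  the ACL cannot be examined under rcu-walk, retry in ref-walk
 *   -EAGAIN  no ACL cached for this inode, fall back to the mode bits
 *   other    the ACL's verdict (0 or -EACCES)
 */
static int check_acl_rcu_caller_sketch(struct inode *inode, int mask,
				       unsigned int flags)
{
	if (inode->i_op->check_acl_rcu) {
		int error = inode->i_op->check_acl_rcu(inode, mask, flags);

		/* -ECHILD and real verdicts propagate; only -EAGAIN falls
		 * through to the ordinary permission-bit check. */
		if (error != -EAGAIN)
			return error;
	}

	return generic_permission(inode, mask, NULL);
}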




* Re: scalability investigation: Where can I get your latest patches?
  2010-08-04  1:04   ` scalability investigation: Where can I get your latest patches? Zhang, Yanmin
  2010-08-04  7:21     ` Kleen, Andi
@ 2010-08-05 10:55     ` Nick Piggin
  2010-08-09  2:11       ` Zhang, Yanmin
  2010-08-09  3:20       ` Zhang, Yanmin
  2010-08-05 11:44     ` Nick Piggin
  2 siblings, 2 replies; 11+ messages in thread
From: Nick Piggin @ 2010-08-05 10:55 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Nick Piggin, andi.kleen, alexs.shi, linux-mm, linux-fsdevel

On Wed, Aug 04, 2010 at 09:04:03AM +0800, Zhang, Yanmin wrote:
> On Tue, 2010-07-20 at 13:12 +1000, Nick Piggin wrote:
> > On Thu, Jul 08, 2010 at 04:56:27PM +0800, Zhang, Yanmin wrote:
> > > Nick,
> > > 
> > > I work with Andi Kleen and Tim to investigate some scalability issues.
> > > 
> > > Andi gave me a pointer at:
> > > http://thread.gmane.org/gmane.linux.kernel/1002380/focus=42284
> > > 
> > > Where can I get your latest patches? It's better if I could get patch tarball.
> > > 
> > > Thanks,
> > > Yanmin
> > > 
> > 
> > Hi Yanmin,
> > 
> > Sorry for the delay. I have a git tree now, and it has been through
> > some stress testing.
> > 
> > http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-npiggin.git
> > 
> > I would be very interested to know if you encounter problems or are
> > able to generate any benchmark numbers.
> Nick,
> 
> We ran lots of benchmarks on many machines. Below is something to
> share with you.

Great, thanks for doing this!

 
> Improvement:
> 1) We get about 30% improvement with kbuild workload on Nehalem
> machines. It's hard to improve kbuild performance. Your tree does.

Well, that's nice. What size of machine is this? Did you run it on an
ACL-enabled filesystem?


> Issues:
> 1) Compiling fails on a couple of file systems, such like CONFIG_ISO9660_FS=y.

Yes there are a couple that broke, which I still need to fix up.


> 2) dbenchthreads has about 50% regression. We connect a JBOD of 12 disks to
> a machine. Start 4 dbench threads per disk.  We run the workload under
> a regular user account. If we run it under root account, we get 22%
> improvement instead of regression.  The root cause is ACL checking.
> With your patch, do_path_lookup firstly goes through rcu steps which
> including a exec permission checking. With ACL, the __exec_permission
> always fails. Then a later nameidata_drop_rcu often fails as
> dentry->d_seq is changed.
> 
> With root account, it doesn't happen. We mount the working devices
> under /mnt/stp/XXX.  /mnt is of root user. So the exec permission
> check is ok.

Yes, if running as root, this should have the same effect as the
rcu-walk-aware ACL patch. BTW, dbench has a nasty call to statvfs(),
which is a huge cost (and which should be fixed in future versions of
the kernel and glibc). You can try switching the statvfs(2) call in fileio.c
to statfs(2) and see if performance improves.
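
For illustration, the change would be along these lines (a hedged sketch: the exact
call site in dbench's fileio.c may differ, and the function name here is made up):

#include <sys/statfs.h>

/*
 * Sketch of switching a free-space query from statvfs(2) to statfs(2).
 * glibc's statvfs() typically also parses /proc/mounts to fill in mount
 * flags, which is the extra cost referred to above; statfs() is a single
 * system call.
 */
static unsigned long long disk_free_bytes(const char *path)
{
	struct statfs st;

	/* was: struct statvfs st; statvfs(path, &st); */
	if (statfs(path, &st) != 0)
		return 0;

	return (unsigned long long)st.f_bavail * st.f_bsize;
}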

Are you disk bound or CPU bound at this point?

> I remount all file systems on the testing path with noacl option, and
> get the similar results like under root account.
> 
> 3) aim7 has about 40% regression on Nehalem EX 4-socket machine. The
> root cause is the same thing like 2).
 
Thanks for subsequently porting and testing the ACL patch. I saw some
performance gains with reaim on a 2-socket, 8-core machine, although it
would depend on the workfile used.

Thanks,
Nick



* Re: scalability investigation: Where can I get your latest patches?
  2010-08-04  8:50           ` Zhang, Yanmin
@ 2010-08-05 10:57             ` Nick Piggin
  0 siblings, 0 replies; 11+ messages in thread
From: Nick Piggin @ 2010-08-05 10:57 UTC (permalink / raw)
  To: Zhang, Yanmin
  Cc: Kleen, Andi, Nick Piggin, Shi, Alex, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org

On Wed, Aug 04, 2010 at 04:50:23PM +0800, Zhang, Yanmin wrote:
> On Wed, 2010-08-04 at 09:06 +0100, Kleen, Andi wrote:
> > > > I believe the latest version of Nick's patchkit has a likely fix for
> > > that.
> > > >
> > > > http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-
> > > npiggin.git;a=commitdiff;h=9edd35f9aeafc8a5e1688b84cf4488a94898ca45
> > > 
> > > Thanks Andi. The patch has no ext3 part.
> > 
> > Good point. But perhaps the ext2 patch can be adapted. The ACL code
> > should be similar in ext2 and ext3 (and 4)
> I ported ext2 part to ext3. aim7 testing on Nehalem EX 4 socket machine
> shows the regression disappears.

Thanks, this looks fine. I'll port several more of the popular
filesystems over ASAP.




* Re: scalability investigation: Where can I get your latest patches?
  2010-08-04  1:04   ` scalability investigation: Where can I get your latest patches? Zhang, Yanmin
  2010-08-04  7:21     ` Kleen, Andi
  2010-08-05 10:55     ` Nick Piggin
@ 2010-08-05 11:44     ` Nick Piggin
  2010-08-09  2:36       ` Zhang, Yanmin
  2 siblings, 1 reply; 11+ messages in thread
From: Nick Piggin @ 2010-08-05 11:44 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Nick Piggin, andi.kleen, alexs.shi, linux-mm, linux-fsdevel

On Wed, Aug 04, 2010 at 09:04:03AM +0800, Zhang, Yanmin wrote:
> We ran lots of benchmarks on many machines. Below is something to
> share with you.
> 
> Improvement:
> 1) We get about 30% improvement with kbuild workload on Nehalem
> machines. It's hard to improve kbuild performance. Your tree does.
> 
> Issues:
> 1) Compiling fails on a couple of file systems, such like CONFIG_ISO9660_FS=y.
> 2) dbenchthreads has about 50% regression. We connect a JBOD of 12 disks to
> a machine. Start 4 dbench threads per disk.  We run the workload under
> a regular user account. If we run it under root account, we get 22%
> improvement instead of regression.  The root cause is ACL checking.
> With your patch, do_path_lookup firstly goes through rcu steps which
> including a exec permission checking. With ACL, the __exec_permission
> always fails. Then a later nameidata_drop_rcu often fails as
> dentry->d_seq is changed.

Oh, one other thing I wanted to ask about. d_seq changing should not
be too common. The directory being renamed, or it turning negative,
should be the only cases in which we see a d_seq change.

That is, unless there is a bug and it is checking the wrong sequence or
the wrong dentry. How often would you say nameidata_drop_rcu
fails (without the ACL rcu patches that followed)?
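
For clarity, the check in question is the usual seqcount pattern; a rough illustration
(not the code from the tree, which is where d_seq is added to struct dentry):

#include <linux/dcache.h>
#include <linux/errno.h>
#include <linux/seqlock.h>

/*
 * Illustrative only: during rcu-walk the dentry's sequence count is sampled
 * with read_seqcount_begin(&dentry->d_seq); before committing to the dentry
 * (taking a real reference) the walk verifies the count has not changed.
 */
static int commit_dentry_sketch(struct dentry *dentry, unsigned int seq)
{
	if (read_seqcount_retry(&dentry->d_seq, seq)) {
		/* The dentry was renamed or turned negative in the
		 * meantime: abandon rcu-walk and retry with references. */
		return -ECHILD;
	}
	return 0;	/* unchanged: safe to take a reference */
}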



* Re: scalability investigation: Where can I get your latest patches?
  2010-08-05 10:55     ` Nick Piggin
@ 2010-08-09  2:11       ` Zhang, Yanmin
  2010-08-09  3:20       ` Zhang, Yanmin
  1 sibling, 0 replies; 11+ messages in thread
From: Zhang, Yanmin @ 2010-08-09  2:11 UTC (permalink / raw)
  To: Nick Piggin; +Cc: andi.kleen, alexs.shi, linux-mm, linux-fsdevel

On Thu, 2010-08-05 at 20:55 +1000, Nick Piggin wrote:
> On Wed, Aug 04, 2010 at 09:04:03AM +0800, Zhang, Yanmin wrote:
> > On Tue, 2010-07-20 at 13:12 +1000, Nick Piggin wrote:
> > > On Thu, Jul 08, 2010 at 04:56:27PM +0800, Zhang, Yanmin wrote:
> > > > Nick,
> > > > 
> > > > I work with Andi Kleen and Tim to investigate some scalability issues.
> > > > 
> > > > Andi gave me a pointer at:
> > > > http://thread.gmane.org/gmane.linux.kernel/1002380/focus=42284
> > > > 
> > > > Where can I get your latest patches? It's better if I could get patch tarball.
> > > > 
> > > > Thanks,
> > > > Yanmin
> > > > 
> > > 
> > > Hi Yanmin,
> > > 
> > > Sorry for the delay. I have a git tree now, and it has been through
> > > some stress testing.
> > > 
> > > http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-npiggin.git
> > > 
> > > I would be very interested to know if you encounter problems or are
> > > able to generate any benchmark numbers.
> > Nick,
> > 
> > We ran lots of benchmarks on many machines. Below is something to
> > share with you.
> 
> Great, thanks for doing this!
> 
>  
> > Improvement:
> > 1) We get about 30% improvement with kbuild workload on Nehalem
> > machines. It's hard to improve kbuild performance. Your tree does.
> 
> Well that's nice. What size of machine is this?
It's a dual-socket Nehalem machine with 2*4*2 = 16 logical CPUs and 6GB of memory.

>  Did you run it on an
> ACL enabled filesystem?
Yes. The root filesystem is ext3 with ACLs enabled.

> 
> 
> > Issues:
> > 1) Compiling fails on a couple of file systems, such like CONFIG_ISO9660_FS=y.
> 
> Yes there are a couple that broke, which I still need to fix up.
> 
> 
> > 2) dbenchthreads has about 50% regression. We connect a JBOD of 12 disks to
> > a machine. Start 4 dbench threads per disk.  We run the workload under
> > a regular user account. If we run it under root account, we get 22%
> > improvement instead of regression.  The root cause is ACL checking.
> > With your patch, do_path_lookup firstly goes through rcu steps which
> > including a exec permission checking. With ACL, the __exec_permission
> > always fails. Then a later nameidata_drop_rcu often fails as
> > dentry->d_seq is changed.
> > 
> > With root account, it doesn't happen. We mount the working devices
> > under /mnt/stp/XXX.  /mnt is of root user. So the exec permission
> > check is ok.
> 
> Yes if running with root, this should have the same effect as the
> rcu-walk aware ACL patch. BTW. dbench has a nasty call to statvfs()
> which is a huge cost (which should be fixed in future versions of
> kernel+glibc). You can try switching the statvfs(2) call in fileio.c
> to statfs(2) and see if performance improves.
> 
> Are you disk bound or CPU bound at this point?
CPU bound.

> 
> > I remount all file systems on the testing path with noacl option, and
> > get the similar results like under root account.
> > 
> > 3) aim7 has about 40% regression on Nehalem EX 4-socket machine. The
> > root cause is the same thing like 2).
>  
> Thanks for subsequently porting and testing the ACL patch. I saw some
> performance gains on reaim on 2 socket 8 core machine, although it
> would depend on the workfile used.
I don't find that your patches have much impact on the aim7 workload on dual-socket
machines, but they do on 4-socket and 8-socket Nehalem EX machines.




* Re: scalability investigation: Where can I get your latest patches?
  2010-08-05 11:44     ` Nick Piggin
@ 2010-08-09  2:36       ` Zhang, Yanmin
  0 siblings, 0 replies; 11+ messages in thread
From: Zhang, Yanmin @ 2010-08-09  2:36 UTC (permalink / raw)
  To: Nick Piggin; +Cc: andi.kleen, alex.shi, linux-mm, linux-fsdevel

On Thu, 2010-08-05 at 21:44 +1000, Nick Piggin wrote:
> On Wed, Aug 04, 2010 at 09:04:03AM +0800, Zhang, Yanmin wrote:
> > We ran lots of benchmarks on many machines. Below is something to
> > share with you.
> > 
> > Improvement:
> > 1) We get about 30% improvement with kbuild workload on Nehalem
> > machines. It's hard to improve kbuild performance. Your tree does.
> > 
> > Issues:
> > 1) Compiling fails on a couple of file systems, such like CONFIG_ISO9660_FS=y.
> > 2) dbenchthreads has about 50% regression. We connect a JBOD of 12 disks to
> > a machine. Start 4 dbench threads per disk.  We run the workload under
> > a regular user account. If we run it under root account, we get 22%
> > improvement instead of regression.  The root cause is ACL checking.
> > With your patch, do_path_lookup firstly goes through rcu steps which
> > including a exec permission checking. With ACL, the __exec_permission
> > always fails. Then a later nameidata_drop_rcu often fails as
> > dentry->d_seq is changed.
> 
> Oh one other thing I wanted to ask about. d_seq changing should not
> be too common. If the directory is renamed, or if it is turned negative
> should be the only cases in which we should see a d_seq changes.
> 
> Or unless there is a bug and it is checking the wrong sequence or
> against the wrong dentry. 
Sorry for misleading you. It fails at the beginning of nameidata_drop_rcu
because (nd->flags & LOOKUP_FIRST) is true.

> How often would you say nameidata_drop_rcu
> fails (without the following acl rcu patches)?
I instrumented the kernel and found that nameidata_drop_rcu always fails.
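
(How it was measured isn't shown here; a minimal way to instrument it would be
something like the following, with the names made up.)

#include <linux/kernel.h>
#include <asm/atomic.h>

/* Hypothetical instrumentation sketch, not from the kernel under test:
 * count how often the rcu-walk -> ref-walk fallback is taken and report
 * it every 10000 hits. */
static atomic_t drop_rcu_failures = ATOMIC_INIT(0);

static void count_drop_rcu_failure(void)
{
	int n = atomic_inc_return(&drop_rcu_failures);

	if ((n % 10000) == 0)
		printk(KERN_DEBUG "nameidata_drop_rcu fallbacks: %d\n", n);
}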




* Re: scalability investigation: Where can I get your latest patches?
  2010-08-05 10:55     ` Nick Piggin
  2010-08-09  2:11       ` Zhang, Yanmin
@ 2010-08-09  3:20       ` Zhang, Yanmin
  1 sibling, 0 replies; 11+ messages in thread
From: Zhang, Yanmin @ 2010-08-09  3:20 UTC (permalink / raw)
  To: Nick Piggin; +Cc: andi.kleen, alexs.shi, linux-mm, linux-fsdevel

On Thu, 2010-08-05 at 20:55 +1000, Nick Piggin wrote:
> On Wed, Aug 04, 2010 at 09:04:03AM +0800, Zhang, Yanmin wrote:
> > On Tue, 2010-07-20 at 13:12 +1000, Nick Piggin wrote:
> > > On Thu, Jul 08, 2010 at 04:56:27PM +0800, Zhang, Yanmin wrote:
> > > > Nick,
> > > > 
> > > > I work with Andi Kleen and Tim to investigate some scalability issues.
> > > > 
> > > > Andi gave me a pointer at:
> > > > http://thread.gmane.org/gmane.linux.kernel/1002380/focus=42284
> > > > 
> > > > Where can I get your latest patches? It's better if I could get patch tarball.
> > > > 
> > > > Thanks,
> > > > Yanmin
> > > > 
> > > 
> > > Hi Yanmin,
> > > 
> > > Sorry for the delay. I have a git tree now, and it has been through
> > > some stress testing.
> > > 
> > > http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-npiggin.git
> > > 
> > > I would be very interested to know if you encounter problems or are
> > > able to generate any benchmark numbers.
> > Nick,
> > 
> > We ran lots of benchmarks on many machines. Below is something to
> > share with you.
> 
> Great, thanks for doing this!
> 
>  
> > Improvement:
> > 1) We get about 30% improvement with kbuild workload on Nehalem
> > machines. It's hard to improve kbuild performance. Your tree does.
> 
> Well that's nice. What size of machine is this? Did you run it on an
> ACL enabled filesystem?
> 
> 
> > Issues:
> > 1) Compiling fails on a couple of file systems, such like CONFIG_ISO9660_FS=y.
> 
> Yes there are a couple that broke, which I still need to fix up.
> 
> 
> > 2) dbenchthreads has about 50% regression. We connect a JBOD of 12 disks to
> > a machine. Start 4 dbench threads per disk.  We run the workload under
> > a regular user account. If we run it under root account, we get 22%
> > improvement instead of regression.  The root cause is ACL checking.
> > With your patch, do_path_lookup firstly goes through rcu steps which
> > including a exec permission checking. With ACL, the __exec_permission
> > always fails. Then a later nameidata_drop_rcu often fails as
> > dentry->d_seq is changed.
> > 
> > With root account, it doesn't happen. We mount the working devices
> > under /mnt/stp/XXX.  /mnt is of root user. So the exec permission
> > check is ok.
> 
> Yes if running with root, this should have the same effect as the
> rcu-walk aware ACL patch. 

> BTW. dbench has a nasty call to statvfs()
> which is a huge cost (which should be fixed in future versions of
> kernel+glibc). You can try switching the statvfs(2) call in fileio.c
> to statfs(2) and see if performance improves.
I changed the statvfs call to statfs and got a 40% improvement with your patch.
Pure 2.6.35-rc5 also gets a 40% improvement.




end of thread, other threads:[~2010-08-09  3:20 UTC | newest]

Thread overview: 11+ messages
-- links below jump to the message on this page --
     [not found] <1278579387.2096.889.camel@ymzhang.sh.intel.com>
     [not found] ` <20100720031201.GC21274@amd>
2010-08-04  1:04   ` scalability investigation: Where can I get your latest patches? Zhang, Yanmin
2010-08-04  7:21     ` Kleen, Andi
2010-08-04  7:58       ` Zhang, Yanmin
2010-08-04  8:06         ` Kleen, Andi
2010-08-04  8:50           ` Zhang, Yanmin
2010-08-05 10:57             ` Nick Piggin
2010-08-05 10:55     ` Nick Piggin
2010-08-09  2:11       ` Zhang, Yanmin
2010-08-09  3:20       ` Zhang, Yanmin
2010-08-05 11:44     ` Nick Piggin
2010-08-09  2:36       ` Zhang, Yanmin
