linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* ext3_group_sparse slow?
@ 2005-01-11 14:48 Jan Kara
  2005-01-11 18:37 ` Andreas Dilger
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2005-01-11 14:48 UTC (permalink / raw)
  To: linux-fsdevel

  Hello!

  I've been profiling dbench load on ext3 on my laptop and it looks as
follows:
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        symbol name
100485   23.2633  __copy_to_user_ll
54497    12.6166  __copy_from_user_ll
34702     8.0339  acpi_processor_idle
15107     3.4974  link_path_walk
8229      1.9051  __d_lookup
6540      1.5141  ext3_group_sparse
6490      1.5025  __might_sleep
4972      1.1511  sysenter_past_esp
3981      0.9216  kmem_cache_alloc
3797      0.8790  ext3_readdir
3617      0.8374  find_get_page
3615      0.8369  ext3_get_group_desc
3589      0.8309  ext3_do_update_inode
...

  What caught my eye is that ext3_group_sparse() and also
ext3_get_group_desc() take so much time. OK, it's only 1.51 and 0.84 percent
of the processor time and it is clearly pretty much dominated by the IO
but anyway I think we need not spend so much processor time by such
trivial operations :) Do you think it would be worth improving?

								Honza

-- 
Jan Kara <jack@suse.cz>
SuSE CR Labs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: ext3_group_sparse slow?
  2005-01-11 14:48 ext3_group_sparse slow? Jan Kara
@ 2005-01-11 18:37 ` Andreas Dilger
  2005-01-12 14:18   ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Dilger @ 2005-01-11 18:37 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 3333 bytes --]

On Jan 11, 2005  15:48 +0100, Jan Kara wrote:
>   I've been profiling dbench load on ext3 on my laptop and it looks as
> follows:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples  %        symbol name
> 100485   23.2633  __copy_to_user_ll
> 54497    12.6166  __copy_from_user_ll
> 34702     8.0339  acpi_processor_idle
> 15107     3.4974  link_path_walk
> 8229      1.9051  __d_lookup
> 6540      1.5141  ext3_group_sparse
> 6490      1.5025  __might_sleep
> 4972      1.1511  sysenter_past_esp
> 3981      0.9216  kmem_cache_alloc
> 3797      0.8790  ext3_readdir
> 3617      0.8374  find_get_page
> 3615      0.8369  ext3_get_group_desc
> 3589      0.8309  ext3_do_update_inode
> ...
> 
> ext3_get_group_desc() take so much time.

The ext3_get_group_desc() code is using a div and mod, which may be
what is slowing things down there.  These could easily be changed to
shift and mask.  The ext3_group_desc struct is an on-disk struct and isn't
changing size any time soon (ever?) and is also a power of two in size.
We already even have s_desc_per_block_bits...  Totally untested patch:


--- 1.27/fs/ext3/balloc.c	2004-11-11 01:34:33 -07:00
+++ edited/fs/ext3/balloc.c	2005-01-11 11:04:55 -07:00
@@ -56,8 +56,8 @@
 	}
 	smp_rmb();
 
-	group_desc = block_group / EXT3_DESC_PER_BLOCK(sb);
-	desc = block_group % EXT3_DESC_PER_BLOCK(sb);
+	group_desc = block_group >> EXT3_DESC_PER_BLOCK_BITS(sb);
+	desc = block_group & (EXT3_DESC_PER_BLOCK(sb) - 1);
 	if (!EXT3_SB(sb)->s_group_desc[group_desc]) {
 		ext3_error (sb, "ext3_get_group_desc",
 			    "Group descriptor not loaded - "
 
> ext3_group_sparse()

As for ext3_group_sparse() this code is very inefficient for large
filesystems as it repeatedly does div and mod operations.  Far better
is to use an incremental method using multiplies (untested patch below).
The ext3_list_backups() function is in ext3/resize.c, so may need an
entry in a header somewhere.

It may be as good to use a table lookup since the number of groups with
backup superblocks in a filesystem is actually very small but given that
CPU speeds are growing much faster than memory speed it might be faster
to do a few multiples (at most 20 for a 16TB filesystem) than to have a
cache miss.

===== fs/ext3/balloc.c 1.27 vs edited =====
--- 1.27/fs/ext3/balloc.c	2004-11-11 01:34:33 -07:00
+++ edited/fs/ext3/balloc.c	2005-01-11 11:27:04 -07:00
@@ -1433,23 +1433,21 @@
 			 EXT3_BLOCKS_PER_GROUP(sb), map);
 }
 
-static inline int test_root(int a, int b)
+int ext3_group_sparse(int group)
 {
-	if (a == 0)
+	unsigned int three = 1, five = 5, seven = 7;
+	unsigned int grp;
+
+	if (!(fs->flags & FL_SPARSE))
 		return 1;
-	while (1) {
-		if (a == 1)
-			return 1;
-		if (a % b)
-			return 0;
-		a = a / b;
-	}
-}
 
-int ext3_group_sparse(int group)
-{
-	return (test_root(group, 3) || test_root(group, 5) ||
-		test_root(group, 7));
+	if (group == 0)
+		return 1;
+
+	while ((grp = ext3_list_backups(fs, &three, &five, &seven)) < group)
+		/* do nothing */;
+
+	return (grp == group);
 }
 
 /**

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://members.shaw.ca/adilger/             http://members.shaw.ca/golinux/


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: ext3_group_sparse slow?
  2005-01-11 18:37 ` Andreas Dilger
@ 2005-01-12 14:18   ` Jan Kara
  2005-01-12 18:23     ` Andreas Dilger
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2005-01-12 14:18 UTC (permalink / raw)
  To: Andreas Dilger, linux-fsdevel

> On Jan 11, 2005  15:48 +0100, Jan Kara wrote:
> >   I've been profiling dbench load on ext3 on my laptop and it looks as
> > follows:
> > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > Profiling through timer interrupt
> > samples  %        symbol name
> > 100485   23.2633  __copy_to_user_ll
> > 54497    12.6166  __copy_from_user_ll
> > 34702     8.0339  acpi_processor_idle
> > 15107     3.4974  link_path_walk
> > 8229      1.9051  __d_lookup
> > 6540      1.5141  ext3_group_sparse
> > 6490      1.5025  __might_sleep
> > 4972      1.1511  sysenter_past_esp
> > 3981      0.9216  kmem_cache_alloc
> > 3797      0.8790  ext3_readdir
> > 3617      0.8374  find_get_page
> > 3615      0.8369  ext3_get_group_desc
> > 3589      0.8309  ext3_do_update_inode
> > ...
> > 
> > ext3_get_group_desc() take so much time.
> 
> The ext3_get_group_desc() code is using a div and mod, which may be
> what is slowing things down there.  These could easily be changed to
> shift and mask.  The ext3_group_desc struct is an on-disk struct and isn't
> changing size any time soon (ever?) and is also a power of two in size.
> We already even have s_desc_per_block_bits...  Totally untested patch:
> 
> --- 1.27/fs/ext3/balloc.c	2004-11-11 01:34:33 -07:00
> +++ edited/fs/ext3/balloc.c	2005-01-11 11:04:55 -07:00
> @@ -56,8 +56,8 @@
>  	}
>  	smp_rmb();
>  
> -	group_desc = block_group / EXT3_DESC_PER_BLOCK(sb);
> -	desc = block_group % EXT3_DESC_PER_BLOCK(sb);
> +	group_desc = block_group >> EXT3_DESC_PER_BLOCK_BITS(sb);
> +	desc = block_group & (EXT3_DESC_PER_BLOCK(sb) - 1);
>  	if (!EXT3_SB(sb)->s_group_desc[group_desc]) {
>  		ext3_error (sb, "ext3_get_group_desc",
>  			    "Group descriptor not loaded - "
  This looks OK.

> > ext3_group_sparse()
> 
> As for ext3_group_sparse() this code is very inefficient for large
> filesystems as it repeatedly does div and mod operations.  Far better
> is to use an incremental method using multiplies (untested patch below).
> The ext3_list_backups() function is in ext3/resize.c, so may need an
> entry in a header somewhere.
> 
> It may be as good to use a table lookup since the number of groups with
> backup superblocks in a filesystem is actually very small but given that
> CPU speeds are growing much faster than memory speed it might be faster
> to do a few multiples (at most 20 for a 16TB filesystem) than to have a
> cache miss.
  I agree with the idea but would not be the following solution better?
It's simplier, you save the function call etc..

--- linux-2.6.10.orig/fs/ext3/balloc.c	2005-01-05 17:19:33.000000000 +0100
+++ linux-2.6.10-ext3speedup/fs/ext3/balloc.c	2005-01-12 13:07:48.000000000 +0100
@@ -1435,19 +1435,17 @@
 
 static inline int test_root(int a, int b)
 {
-	if (a == 0)
-		return 1;
-	while (1) {
-		if (a == 1)
-			return 1;
-		if (a % b)
-			return 0;
-		a = a / b;
-	}
+	int num = b;
+
+	while (a > num)
+		num *= b;
+	return num == a;
 }
 
 int ext3_group_sparse(int group)
 {
+	if (group <= 1)
+		return 1;
 	return (test_root(group, 3) || test_root(group, 5) ||
 		test_root(group, 7));
 }

I profiled your first patch + the above one and ext3_group_sparse()
takes now 0.27% and ext3_get_group_desc() 0.33% which seems much better
:) If you agree I can submit it to Andrew.
								Honza

-- 
Jan Kara <jack@suse.cz>
SuSE CR Labs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: ext3_group_sparse slow?
  2005-01-12 14:18   ` Jan Kara
@ 2005-01-12 18:23     ` Andreas Dilger
  0 siblings, 0 replies; 4+ messages in thread
From: Andreas Dilger @ 2005-01-12 18:23 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 2853 bytes --]

On Jan 12, 2005  15:18 +0100, Jan Kara wrote:
> > On Jan 11, 2005  15:48 +0100, Jan Kara wrote:
> > >   I've been profiling dbench load on ext3 on my laptop and it looks as
> > > follows:
> > > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > > Profiling through timer interrupt
> > > samples  %        symbol name
> > > 6540      1.5141  ext3_group_sparse
> > > 3615      0.8369  ext3_get_group_desc
> > 
> > The ext3_get_group_desc() code is using a div and mod, which may be
> > what is slowing things down there.  These could easily be changed to
> > shift and mask.  The ext3_group_desc struct is an on-disk struct and isn't
> > changing size any time soon (ever?) and is also a power of two in size.
> > We already even have s_desc_per_block_bits...  Totally untested patch:
> > 
> > --- 1.27/fs/ext3/balloc.c	2004-11-11 01:34:33 -07:00
> > +++ edited/fs/ext3/balloc.c	2005-01-11 11:04:55 -07:00
> > @@ -56,8 +56,8 @@
> >  	}
> >  	smp_rmb();
> >  
> > -	group_desc = block_group / EXT3_DESC_PER_BLOCK(sb);
> > -	desc = block_group % EXT3_DESC_PER_BLOCK(sb);
> > +	group_desc = block_group >> EXT3_DESC_PER_BLOCK_BITS(sb);
> > +	desc = block_group & (EXT3_DESC_PER_BLOCK(sb) - 1);
> >  	if (!EXT3_SB(sb)->s_group_desc[group_desc]) {
> >  		ext3_error (sb, "ext3_get_group_desc",
> >  			    "Group descriptor not loaded - "
>   This looks OK.
> 
> > > ext3_group_sparse()
> > 
> > As for ext3_group_sparse() this code is very inefficient for large
> > filesystems as it repeatedly does div and mod operations.  Far better
> > is to use an incremental method using multiplies
>
>   I agree with the idea but would not be the following solution better?
> It's simplier, you save the function call etc..
> 
> --- linux-2.6.10.orig/fs/ext3/balloc.c	2005-01-05 17:19:33.000000000 +0100
> +++ linux-2.6.10-ext3speedup/fs/ext3/balloc.c	2005-01-12 13:07:48.000000000 +0100
> @@ -1435,19 +1435,17 @@
>  
>  static inline int test_root(int a, int b)
>  {
> -	if (a == 0)
> -		return 1;
> -	while (1) {
> -		if (a == 1)
> -			return 1;
> -		if (a % b)
> -			return 0;
> -		a = a / b;
> -	}
> +	int num = b;
> +
> +	while (a > num)
> +		num *= b;
> +	return num == a;
>  }
>  
>  int ext3_group_sparse(int group)
>  {
> +	if (group <= 1)
> +		return 1;
>  	return (test_root(group, 3) || test_root(group, 5) ||
>  		test_root(group, 7));
>  }
> 
> I profiled your first patch + the above one and ext3_group_sparse()
> takes now 0.27% and ext3_get_group_desc() 0.33% which seems much better
> :) If you agree I can submit it to Andrew.

Sure, you can add my Signed-off-by: Andreas Dilger <adilger@clusterfs.com>

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://members.shaw.ca/adilger/             http://members.shaw.ca/golinux/


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2005-01-12 20:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-01-11 14:48 ext3_group_sparse slow? Jan Kara
2005-01-11 18:37 ` Andreas Dilger
2005-01-12 14:18   ` Jan Kara
2005-01-12 18:23     ` Andreas Dilger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).