* [PATCH] Minor ext3 speedup
@ 2005-01-13 12:15 Jan Kara
2005-01-15 17:04 ` Matthew Wilcox
0 siblings, 1 reply; 3+ messages in thread
From: Jan Kara @ 2005-01-13 12:15 UTC (permalink / raw)
To: akpm; +Cc: Andreas Dilger, linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 281 bytes --]
Hello Andrew!
Attached patch removes unnecessary division and modulo from ext3 code
paths. It reduces (according to oprofile) the CPU usage measurably under
a dbench load (see description of the patch for the numbers).
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
[-- Attachment #2: ext3-speedup-2.6.10-linus.diff --]
[-- Type: text/plain, Size: 1275 bytes --]
Remove unnecessary division and modulo from ext3 code in often used paths.
Without the patch an oprofile of dbench load shows ext3_get_group_desc()
uses 0.84% and ext3_group_sparse() 1.51%, with the patch the numbers are
0.33% and 0.27% respectively.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
--- linux/fs/ext3/balloc.c 2005-01-13 13:31:42.000000000 +0100
+++ linux/fs/ext3/balloc.c 2005-01-13 14:59:13.069702736 +0100
@@ -56,8 +56,8 @@
}
smp_rmb();
- group_desc = block_group / EXT3_DESC_PER_BLOCK(sb);
- desc = block_group % EXT3_DESC_PER_BLOCK(sb);
+ group_desc = block_group >> EXT3_DESC_PER_BLOCK_BITS(sb);
+ desc = block_group & (EXT3_DESC_PER_BLOCK(sb) - 1);
if (!EXT3_SB(sb)->s_group_desc[group_desc]) {
ext3_error (sb, "ext3_get_group_desc",
"Group descriptor not loaded - "
@@ -1440,19 +1440,17 @@
static inline int test_root(int a, int b)
{
- if (a == 0)
- return 1;
- while (1) {
- if (a == 1)
- return 1;
- if (a % b)
- return 0;
- a = a / b;
- }
+ int num = b;
+
+ while (a > num)
+ num *= b;
+ return num == a;
}
static int ext3_group_sparse(int group)
{
+ if (group <= 1)
+ return 1;
return (test_root(group, 3) || test_root(group, 5) ||
test_root(group, 7));
}
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Minor ext3 speedup
2005-01-13 12:15 [PATCH] Minor ext3 speedup Jan Kara
@ 2005-01-15 17:04 ` Matthew Wilcox
2005-01-16 9:03 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Matthew Wilcox @ 2005-01-15 17:04 UTC (permalink / raw)
To: Jan Kara; +Cc: akpm, Andreas Dilger, linux-fsdevel
On Thu, Jan 13, 2005 at 01:15:06PM +0100, Jan Kara wrote:
> Attached patch removes unnecessary division and modulo from ext3 code
> paths. It reduces (according to oprofile) the CPU usage measurably under
> a dbench load (see description of the patch for the numbers).
I thought I'd apply Jan & Andreas's patch to ext2 too. While doing so,
I noticed several places where patches had been applied to ext2 and not to
ext3. One of them was even a bugfix. This patch brings ext2/balloc.c
and ext3/balloc.c closer together and fixes the bug. It includes the
aforementioned patch for both ext2 and ext3.
For the curious, the bug occurs when ext3_free_blocks_sb() decides to
"do_more". *pdquot_freed_blocks was updated each time around the loop,
so bg_free_blocks_count was getting over-incremented. I fixed it the
same way it had been fixed in ext2 -- by introducing a group_freed
variable that is reset each time around the loop.
Index: linux-2.6/fs/ext2/balloc.c
===================================================================
RCS file: /var/cvs/linux-2.6/fs/ext2/balloc.c,v
retrieving revision 1.4
diff -u -p -r1.4 balloc.c
--- linux-2.6/fs/ext2/balloc.c 13 Sep 2004 15:23:44 -0000 1.4
+++ linux-2.6/fs/ext2/balloc.c 15 Jan 2005 16:45:10 -0000
@@ -6,7 +6,7 @@
* Laboratoire MASI - Institut Blaise Pascal
* Universite Pierre et Marie Curie (Paris VI)
*
- * Enhanced block allocation by Stephen Tweedie (sct@dcs.ed.ac.uk), 1993
+ * Enhanced block allocation by Stephen Tweedie (sct@redhat.com), 1993
* Big-endian to little-endian byte-swapping/bitmaps by
* David S. Miller (davem@caip.rutgers.edu), 1995
*/
@@ -52,9 +52,9 @@ struct ext2_group_desc * ext2_get_group_
return NULL;
}
-
- group_desc = block_group / EXT2_DESC_PER_BLOCK(sb);
- offset = block_group % EXT2_DESC_PER_BLOCK(sb);
+
+ group_desc = block_group >> EXT2_DESC_PER_BLOCK_BITS(sb);
+ offset = block_group & (EXT2_DESC_PER_BLOCK(sb) - 1);
if (!sbi->s_group_desc[group_desc]) {
ext2_error (sb, "ext2_get_group_desc",
"Group descriptor not loaded - "
@@ -62,7 +62,7 @@ struct ext2_group_desc * ext2_get_group_
block_group, group_desc, offset);
return NULL;
}
-
+
desc = (struct ext2_group_desc *) sbi->s_group_desc[group_desc]->b_data;
if (bh)
*bh = sbi->s_group_desc[group_desc];
@@ -236,12 +236,12 @@ do_more:
for (i = 0, group_freed = 0; i < count; i++) {
if (!ext2_clear_bit_atomic(sb_bgl_lock(sbi, block_group),
- bit + i, (void *) bitmap_bh->b_data))
- ext2_error (sb, "ext2_free_blocks",
- "bit already cleared for block %lu",
- block + i);
- else
+ bit + i, bitmap_bh->b_data)) {
+ ext2_error(sb, __FUNCTION__,
+ "bit already cleared for block %lu", block + i);
+ } else {
group_freed++;
+ }
}
mark_buffer_dirty(bitmap_bh);
@@ -569,25 +569,24 @@ unsigned long ext2_count_free_blocks (st
static inline int
block_in_use(unsigned long block, struct super_block *sb, unsigned char *map)
{
- return ext2_test_bit ((block - le32_to_cpu(EXT2_SB(sb)->s_es->s_first_data_block)) %
+ return ext2_test_bit ((block -
+ le32_to_cpu(EXT2_SB(sb)->s_es->s_first_data_block)) %
EXT2_BLOCKS_PER_GROUP(sb), map);
}
static inline int test_root(int a, int b)
{
- if (a == 0)
- return 1;
- while (1) {
- if (a == 1)
- return 1;
- if (a % b)
- return 0;
- a = a / b;
- }
+ int num = b;
+
+ while (a > num)
+ num *= b;
+ return num == a;
}
static int ext2_group_sparse(int group)
{
+ if (group <= 1)
+ return 1;
return (test_root(group, 3) || test_root(group, 5) ||
test_root(group, 7));
}
Index: linux-2.6/fs/ext3/balloc.c
===================================================================
RCS file: /var/cvs/linux-2.6/fs/ext3/balloc.c,v
retrieving revision 1.9
diff -u -p -r1.9 balloc.c
--- linux-2.6/fs/ext3/balloc.c 12 Jan 2005 20:17:23 -0000 1.9
+++ linux-2.6/fs/ext3/balloc.c 15 Jan 2005 16:45:10 -0000
@@ -43,34 +43,34 @@ struct ext3_group_desc * ext3_get_group_
struct buffer_head ** bh)
{
unsigned long group_desc;
- unsigned long desc;
- struct ext3_group_desc * gdp;
+ unsigned long offset;
+ struct ext3_group_desc * desc;
+ struct ext3_sb_info *sbi = EXT3_SB(sb);
- if (block_group >= EXT3_SB(sb)->s_groups_count) {
+ if (block_group >= sbi->s_groups_count) {
ext3_error (sb, "ext3_get_group_desc",
"block_group >= groups_count - "
"block_group = %d, groups_count = %lu",
- block_group, EXT3_SB(sb)->s_groups_count);
+ block_group, sbi->s_groups_count);
return NULL;
}
smp_rmb();
- group_desc = block_group / EXT3_DESC_PER_BLOCK(sb);
- desc = block_group % EXT3_DESC_PER_BLOCK(sb);
- if (!EXT3_SB(sb)->s_group_desc[group_desc]) {
+ group_desc = block_group >> EXT3_DESC_PER_BLOCK_BITS(sb);
+ offset = block_group & (EXT3_DESC_PER_BLOCK(sb) - 1);
+ if (!sbi->s_group_desc[group_desc]) {
ext3_error (sb, "ext3_get_group_desc",
"Group descriptor not loaded - "
"block_group = %d, group_desc = %lu, desc = %lu",
- block_group, group_desc, desc);
+ block_group, group_desc, offset);
return NULL;
}
- gdp = (struct ext3_group_desc *)
- EXT3_SB(sb)->s_group_desc[group_desc]->b_data;
+ desc = (struct ext3_group_desc *) sbi->s_group_desc[group_desc]->b_data;
if (bh)
- *bh = EXT3_SB(sb)->s_group_desc[group_desc];
- return gdp + desc;
+ *bh = sbi->s_group_desc[group_desc];
+ return desc + offset;
}
/*
@@ -284,14 +284,15 @@ void ext3_free_blocks_sb(handle_t *handl
unsigned long bit;
unsigned long i;
unsigned long overflow;
- struct ext3_group_desc * gdp;
+ struct ext3_group_desc * desc;
struct ext3_super_block * es;
struct ext3_sb_info *sbi;
int err = 0, ret;
+ unsigned group_freed;
*pdquot_freed_blocks = 0;
sbi = EXT3_SB(sb);
- es = EXT3_SB(sb)->s_es;
+ es = sbi->s_es;
if (block < le32_to_cpu(es->s_first_data_block) ||
block + count < block ||
block + count > le32_to_cpu(es->s_blocks_count)) {
@@ -301,7 +302,7 @@ void ext3_free_blocks_sb(handle_t *handl
goto error_return;
}
- ext3_debug ("freeing block %lu\n", block);
+ ext3_debug ("freeing block(s) %lu-%lu\n", block, block + count - 1);
do_more:
overflow = 0;
@@ -321,16 +322,16 @@ do_more:
bitmap_bh = read_block_bitmap(sb, block_group);
if (!bitmap_bh)
goto error_return;
- gdp = ext3_get_group_desc (sb, block_group, &gd_bh);
- if (!gdp)
+ desc = ext3_get_group_desc (sb, block_group, &gd_bh);
+ if (!desc)
goto error_return;
- if (in_range (le32_to_cpu(gdp->bg_block_bitmap), block, count) ||
- in_range (le32_to_cpu(gdp->bg_inode_bitmap), block, count) ||
- in_range (block, le32_to_cpu(gdp->bg_inode_table),
- EXT3_SB(sb)->s_itb_per_group) ||
- in_range (block + count - 1, le32_to_cpu(gdp->bg_inode_table),
- EXT3_SB(sb)->s_itb_per_group))
+ if (in_range (le32_to_cpu(desc->bg_block_bitmap), block, count) ||
+ in_range (le32_to_cpu(desc->bg_inode_bitmap), block, count) ||
+ in_range (block, le32_to_cpu(desc->bg_inode_table),
+ sbi->s_itb_per_group) ||
+ in_range (block + count - 1, le32_to_cpu(desc->bg_inode_table),
+ sbi->s_itb_per_group))
ext3_error (sb, "ext3_free_blocks",
"Freeing blocks in system zones - "
"Block = %lu, count = %lu",
@@ -358,7 +359,7 @@ do_more:
jbd_lock_bh_state(bitmap_bh);
- for (i = 0; i < count; i++) {
+ for (i = 0, group_freed = 0; i < count; i++) {
/*
* An HJ special. This is expensive...
*/
@@ -421,15 +422,15 @@ do_more:
jbd_lock_bh_state(bitmap_bh);
BUFFER_TRACE(bitmap_bh, "bit already cleared");
} else {
- (*pdquot_freed_blocks)++;
+ group_freed++;
}
}
jbd_unlock_bh_state(bitmap_bh);
spin_lock(sb_bgl_lock(sbi, block_group));
- gdp->bg_free_blocks_count =
- cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) +
- *pdquot_freed_blocks);
+ desc->bg_free_blocks_count =
+ cpu_to_le16(le16_to_cpu(desc->bg_free_blocks_count) +
+ group_freed);
spin_unlock(sb_bgl_lock(sbi, block_group));
percpu_counter_mod(&sbi->s_freeblocks_counter, count);
@@ -441,6 +442,7 @@ do_more:
BUFFER_TRACE(gd_bh, "dirtied group descriptor block");
ret = ext3_journal_dirty_metadata(handle, gd_bh);
if (!err) err = ret;
+ *pdquot_freed_blocks += group_freed;
if (overflow && !err) {
block += count;
@@ -1429,9 +1431,8 @@ unsigned long ext3_count_free_blocks(str
#endif
}
-static inline int block_in_use(unsigned long block,
- struct super_block * sb,
- unsigned char * map)
+static inline int
+block_in_use(unsigned long block, struct super_block *sb, unsigned char *map)
{
return ext3_test_bit ((block -
le32_to_cpu(EXT3_SB(sb)->s_es->s_first_data_block)) %
@@ -1440,19 +1441,17 @@ static inline int block_in_use(unsigned
static inline int test_root(int a, int b)
{
- if (a == 0)
- return 1;
- while (1) {
- if (a == 1)
- return 1;
- if (a % b)
- return 0;
- a = a / b;
- }
+ int num = b;
+
+ while (a > num)
+ num *= b;
+ return num == a;
}
static int ext3_group_sparse(int group)
{
+ if (group <= 1)
+ return 1;
return (test_root(group, 3) || test_root(group, 5) ||
test_root(group, 7));
}
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Minor ext3 speedup
2005-01-15 17:04 ` Matthew Wilcox
@ 2005-01-16 9:03 ` Andrew Morton
0 siblings, 0 replies; 3+ messages in thread
From: Andrew Morton @ 2005-01-16 9:03 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: jack, adilger, linux-fsdevel
Matthew Wilcox <matthew@wil.cx> wrote:
>
> On Thu, Jan 13, 2005 at 01:15:06PM +0100, Jan Kara wrote:
> > Attached patch removes unnecessary division and modulo from ext3 code
> > paths. It reduces (according to oprofile) the CPU usage measurably under
> > a dbench load (see description of the patch for the numbers).
>
> I thought I'd apply Jan & Andreas's patch to ext2 too. While doing so,
> I noticed several places where patches had been applied to ext2 and not to
> ext3. One of them was even a bugfix. This patch brings ext2/balloc.c
> and ext3/balloc.c closer together and fixes the bug. It includes the
> aforementioned patch for both ext2 and ext3.
>
> For the curious, the bug occurs when ext3_free_blocks_sb() decides to
> "do_more". *pdquot_freed_blocks was updated each time around the loop,
> so bg_free_blocks_count was getting over-incremented. I fixed it the
> same way it had been fixed in ext2 -- by introducing a group_freed
> variable that is reset each time around the loop.
The patch does boatloads of other things too.
Matthew. You know the routine ;)
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2005-01-16 9:04 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-01-13 12:15 [PATCH] Minor ext3 speedup Jan Kara
2005-01-15 17:04 ` Matthew Wilcox
2005-01-16 9:03 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).