linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient
@ 2013-03-27 11:04 Lukas Czerner
  2013-03-27 11:04 ` [PATCH 2/2 v2] ext4: introduce ext4_get_group_number() Lukas Czerner
  2013-03-27 13:34 ` [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient Theodore Ts'o
  0 siblings, 2 replies; 3+ messages in thread
From: Lukas Czerner @ 2013-03-27 11:04 UTC (permalink / raw)
  To: linux-ext4; +Cc: tytso, Lukas Czerner

Currently in when getting the block group number for a particular block
in ext4_block_in_group() we're using ext4_get_group_no_and_offset()
which uses do_div() to get the block group and the remainer which is
offset within the group.

We don't need all of that in ext4_block_in_group() as we only need to
figure out the group number.

This commit changes ext4_block_in_group() to calculate group number
directly. This shows as a big improvement with regards to cpu
utilization. Measuring fallocate -l 15T on fresh file system with perf
showed that 23% of cpu time was spend in the
ext4_get_group_no_and_offset(). With this change it completely
disappears from the list only bumping the occurrence of
ext4_init_block_bitmap() which is the biggest user of
ext4_block_in_group() by 4%. As the result of this change on my system
the fallocate call was approx. 10% faster.

However since there is '-g' option in mkfs which allow us setting
different groups size (mostly for developers) I've introduced new per
file system flag whether we have a standard block group size or not. The
flag is used to determine whether we can use the bit shift optimization
or not.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
v2: Add sbi flag if there is standard group size

 fs/ext4/balloc.c |   22 ++++++++++++++++------
 fs/ext4/ext4.h   |    3 +++
 fs/ext4/super.c  |    4 ++++
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 92e68b3..a4d2e73 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -49,14 +49,24 @@ void ext4_get_group_no_and_offset(struct super_block *sb, ext4_fsblk_t blocknr,
 
 }
 
-static int ext4_block_in_group(struct super_block *sb, ext4_fsblk_t block,
-			ext4_group_t block_group)
+/*
+ * Check whether the 'block' lives within the 'block_group'. Returns 1 if so
+ * and 0 otherwise.
+ */
+static inline int ext4_block_in_group(struct super_block *sb,
+				      ext4_fsblk_t block,
+				      ext4_group_t block_group)
 {
 	ext4_group_t actual_group;
-	ext4_get_group_no_and_offset(sb, block, &actual_group, NULL);
-	if (actual_group == block_group)
-		return 1;
-	return 0;
+
+	if (test_opt(sb, STD_GROUP_SIZE))
+		actual_group =
+			(le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
+			block) >>
+			(EXT4_BLOCK_SIZE_BITS(sb) + EXT4_CLUSTER_BITS(sb) + 3);
+	else
+		ext4_get_group_no_and_offset(sb, block, &actual_group, NULL);
+	return (actual_group == block_group) ? 1 : 0;
 }
 
 /* Return the number of clusters used for file system metadata; this
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 167ff56..2ee8d76 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -983,6 +983,9 @@ struct ext4_inode_info {
 
 #define EXT4_MOUNT2_EXPLICIT_DELALLOC	0x00000001 /* User explicitly
 						      specified delalloc */
+#define EXT4_MOUNT_STD_GROUP_SIZE	0x00000002 /* We have standard group
+						      size of blocksize * 8
+						      blocks */
 
 #define clear_opt(sb, opt)		EXT4_SB(sb)->s_mount_opt &= \
 						~EXT4_MOUNT_##opt
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d1ee6a8..28ea887 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3522,6 +3522,10 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	sbi->s_addr_per_block_bits = ilog2(EXT4_ADDR_PER_BLOCK(sb));
 	sbi->s_desc_per_block_bits = ilog2(EXT4_DESC_PER_BLOCK(sb));
 
+	/* Do we have standard group size of blocksize * 8 blocks ? */
+	if (sbi->s_blocks_per_group == blocksize << 8)
+		set_opt(sb, STD_GROUP_SIZE);
+
 	for (i = 0; i < 4; i++)
 		sbi->s_hash_seed[i] = le32_to_cpu(es->s_hash_seed[i]);
 	sbi->s_def_hash_version = es->s_def_hash_version;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2 v2] ext4: introduce ext4_get_group_number()
  2013-03-27 11:04 [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient Lukas Czerner
@ 2013-03-27 11:04 ` Lukas Czerner
  2013-03-27 13:34 ` [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient Theodore Ts'o
  1 sibling, 0 replies; 3+ messages in thread
From: Lukas Czerner @ 2013-03-27 11:04 UTC (permalink / raw)
  To: linux-ext4; +Cc: tytso, Lukas Czerner

Currently on many places in ext4 we're using
ext4_get_group_no_and_offset() even though we're only interested in
knowing the block group of the particular block, not the offset within
the block group so we can use more efficient way to compute block group.

This patch introduces ext4_get_group_number() which computes block group
for a given block much more efficiently. Use this function instead of
ext4_get_group_no_and_offset() everywhere where we're only interested in
knowing the block group.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
v2: Update after patch 0001 changed

 fs/ext4/balloc.c  |   25 ++++++++++++++++++-------
 fs/ext4/ext4.h    |    3 +++
 fs/ext4/mballoc.c |    6 +++---
 fs/ext4/resize.c  |   10 +++++-----
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index a4d2e73..062a31dc 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -30,6 +30,23 @@ static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
  */
 
 /*
+ * Calculate block group number for a given block number
+ */
+inline ext4_group_t ext4_get_group_number(struct super_block *sb,
+					  ext4_fsblk_t block)
+{
+	ext4_group_t group;
+
+	if (test_opt(sb, STD_GROUP_SIZE))
+		group = (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
+			 block) >>
+			(EXT4_BLOCK_SIZE_BITS(sb) + EXT4_CLUSTER_BITS(sb) + 3);
+	else
+		ext4_get_group_no_and_offset(sb, block, &group, NULL);
+	return group;
+}
+
+/*
  * Calculate the block group number and offset into the block/cluster
  * allocation bitmap, given a block number
  */
@@ -59,13 +76,7 @@ static inline int ext4_block_in_group(struct super_block *sb,
 {
 	ext4_group_t actual_group;
 
-	if (test_opt(sb, STD_GROUP_SIZE))
-		actual_group =
-			(le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
-			block) >>
-			(EXT4_BLOCK_SIZE_BITS(sb) + EXT4_CLUSTER_BITS(sb) + 3);
-	else
-		ext4_get_group_no_and_offset(sb, block, &actual_group, NULL);
+	actual_group = ext4_get_group_number(sb, block);
 	return (actual_group == block_group) ? 1 : 0;
 }
 
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 2ee8d76..0d7d7749 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1790,6 +1790,9 @@ ext4_group_first_block_no(struct super_block *sb, ext4_group_t group_no)
 void ext4_get_group_no_and_offset(struct super_block *sb, ext4_fsblk_t blocknr,
 			ext4_group_t *blockgrpp, ext4_grpblk_t *offsetp);
 
+inline ext4_group_t ext4_get_group_number(struct super_block *sb,
+					  ext4_fsblk_t block);
+
 /*
  * Timeout and state flag for lazy initialization inode thread.
  */
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index ee6614b..dca7f06 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3342,7 +3342,7 @@ static void ext4_mb_put_pa(struct ext4_allocation_context *ac,
 	if (pa->pa_type == MB_GROUP_PA)
 		grp_blk--;
 
-	ext4_get_group_no_and_offset(sb, grp_blk, &grp, NULL);
+	grp = ext4_get_group_number(sb, grp_blk);
 
 	/*
 	 * possible race:
@@ -3807,7 +3807,7 @@ repeat:
 
 	list_for_each_entry_safe(pa, tmp, &list, u.pa_tmp_list) {
 		BUG_ON(pa->pa_type != MB_INODE_PA);
-		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
+		group = ext4_get_group_number(sb, pa->pa_pstart);
 
 		err = ext4_mb_load_buddy(sb, group, &e4b);
 		if (err) {
@@ -4069,7 +4069,7 @@ ext4_mb_discard_lg_preallocations(struct super_block *sb,
 
 	list_for_each_entry_safe(pa, tmp, &discard_list, u.pa_tmp_list) {
 
-		ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
+		group = ext4_get_group_number(sb, pa->pa_pstart);
 		if (ext4_mb_load_buddy(sb, group, &e4b)) {
 			ext4_error(sb, "Error loading buddy information for %u",
 					group);
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index c169477..e349853 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -272,7 +272,7 @@ next_group:
 		if (start_blk >= last_blk)
 			goto next_group;
 		group_data[bb_index].block_bitmap = start_blk++;
-		ext4_get_group_no_and_offset(sb, start_blk - 1, &group, NULL);
+		group = ext4_get_group_number(sb, start_blk - 1);
 		group -= group_data[0].group;
 		group_data[group].free_blocks_count--;
 		if (flexbg_size > 1)
@@ -284,7 +284,7 @@ next_group:
 		if (start_blk >= last_blk)
 			goto next_group;
 		group_data[ib_index].inode_bitmap = start_blk++;
-		ext4_get_group_no_and_offset(sb, start_blk - 1, &group, NULL);
+		group = ext4_get_group_number(sb, start_blk - 1);
 		group -= group_data[0].group;
 		group_data[group].free_blocks_count--;
 		if (flexbg_size > 1)
@@ -296,7 +296,7 @@ next_group:
 		if (start_blk + EXT4_SB(sb)->s_itb_per_group > last_blk)
 			goto next_group;
 		group_data[it_index].inode_table = start_blk;
-		ext4_get_group_no_and_offset(sb, start_blk, &group, NULL);
+		group = ext4_get_group_number(sb, start_blk - 1);
 		group -= group_data[0].group;
 		group_data[group].free_blocks_count -=
 					EXT4_SB(sb)->s_itb_per_group;
@@ -392,7 +392,7 @@ static int set_flexbg_block_bitmap(struct super_block *sb, handle_t *handle,
 		ext4_group_t group;
 		int err;
 
-		ext4_get_group_no_and_offset(sb, block, &group, NULL);
+		group = ext4_get_group_number(sb, block);
 		start = ext4_group_first_block_no(sb, group);
 		group -= flex_gd->groups[0].group;
 
@@ -1879,7 +1879,7 @@ retry:
 		/* Nothing need to do */
 		return 0;
 
-	ext4_get_group_no_and_offset(sb, n_blocks_count - 1, &n_group, &offset);
+	n_group = ext4_get_group_number(sb, n_blocks_count - 1);
 	ext4_get_group_no_and_offset(sb, o_blocks_count - 1, &o_group, &offset);
 
 	n_desc_blocks = num_desc_blocks(sb, n_group + 1);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient
  2013-03-27 11:04 [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient Lukas Czerner
  2013-03-27 11:04 ` [PATCH 2/2 v2] ext4: introduce ext4_get_group_number() Lukas Czerner
@ 2013-03-27 13:34 ` Theodore Ts'o
  1 sibling, 0 replies; 3+ messages in thread
From: Theodore Ts'o @ 2013-03-27 13:34 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-ext4

On Wed, Mar 27, 2013 at 12:04:04PM +0100, Lukas Czerner wrote:
>  #define EXT4_MOUNT2_EXPLICIT_DELALLOC	0x00000001 /* User explicitly
>  						      specified delalloc */
> +#define EXT4_MOUNT_STD_GROUP_SIZE	0x00000002 /* We have standard group
> +						      size of blocksize * 8
> +						      blocks */

This should be EXT4_MOUNT2_STD_GROUP_SIZE, and then all calls to
set_opt, test_opt, etc. should be set_opt2, test_opt2, etc.  As it
turns out bit 0x0002 isn't used for flags in s_mount_opt family, so
the other fix would be to move this definition to before
EXT4_MOUNT_GRPID in ext4.

However, the EXT4_MOUNT_* options should be reserved for options that
are set via mount options, since we have support for setting bits in
s_mount_opt via a table-driven approach (see how the mount_opt field
in the struct mount_opts in fs/ext4/super.c is used).  So for things
that are set automatically, such as in this case, we should use a bit
assignment for s_mount_opt2.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-03-27 13:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-27 11:04 [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient Lukas Czerner
2013-03-27 11:04 ` [PATCH 2/2 v2] ext4: introduce ext4_get_group_number() Lukas Czerner
2013-03-27 13:34 ` [PATCH 1/2 v2] ext4: Make ext4_block_in_group() much more efficient Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).