linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] mballoc patches
@ 2007-08-13 10:22 Aneesh Kumar K.V
  2007-08-13 10:22 ` [PATCH 1/4] Add some new function for searching extent tree Aneesh Kumar K.V
  2007-08-13 16:38 ` [RFC] mballoc patches Aneesh Kumar K.V
  0 siblings, 2 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2007-08-13 10:22 UTC (permalink / raw)
  To: alex; +Cc: linux-ext4


Alex actually pointed me the new mballoc patches at
ftp://ftp.clusterfs.com/pub/people/alex/mballoc3

The series is the forward port of the same on top of
d4ac2477fad0f2680e84ec12e387ce67682c5c13 (v2.6.23-rc2)


I guess the mballoc3 patch at clusterfs.com is based on
a patched ext3(I guess it is ext3 + extent tree local to
clusterfs ? ). The following series is based on ext4.


I tested the changes with different blocks per group
combination and it seems to be working fine. (The last
series did panic with mke2fs -g 800 )


I will now look at splitting this to smaller patches.

NOTE : I am not sure whether patch 2 will be able to make the list.
if not is there a place i can ftp/scp them so that others can access
the same ?

I haven't split the patches in any order and didn't bothered to
write any meaningful commit messages since this is mostly work in
progress.

-aneesh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/4] Add some new  function for searching extent tree.
  2007-08-13 10:22 [RFC] mballoc patches Aneesh Kumar K.V
@ 2007-08-13 10:22 ` Aneesh Kumar K.V
       [not found]   ` <1187000553923-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
  2007-08-13 16:38 ` [RFC] mballoc patches Aneesh Kumar K.V
  1 sibling, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2007-08-13 10:22 UTC (permalink / raw)
  To: alex; +Cc: linux-ext4, Alex Tomas

From: Alex Tomas <alex@clusterfs.com>

ext4_ext_search_left
ext4_ext_search_right
---
 fs/ext4/extents.c               |  142 +++++++++++++++++++++++++++++++++++++++
 include/linux/ext4_fs_extents.h |    2 +
 2 files changed, 144 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 78beb09..3084e09 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1015,6 +1015,148 @@ out:
 }
 
 /*
+ * search the closest allocated block to the left for *logical
+ * and returns it at @logical + it's physical address at @phys
+ * if *logical is the smallest allocated block, the function
+ * returns 0 at @phys
+ * return value contains 0 (success) or error code
+ */
+int
+ext4_ext_search_left(struct inode *inode, struct ext4_ext_path *path,
+			ext4_fsblk_t *logical, ext4_fsblk_t *phys)
+{
+	struct ext4_extent_idx *ix;
+	struct ext4_extent *ex;
+	int depth;
+
+	BUG_ON(path == NULL);
+	depth = path->p_depth;
+	*phys = 0;
+
+	if (depth == 0 && path->p_ext == NULL)
+		return 0;
+
+	/* usually extent in the path covers blocks smaller
+	 * then *logical, but it can be that extent is the
+	 * first one in the file */
+
+	ex = path[depth].p_ext;
+	if (*logical < le32_to_cpu(ex->ee_block)) {
+		BUG_ON(EXT_FIRST_EXTENT(path[depth].p_hdr) != ex);
+		while (--depth >= 0) {
+			ix = path[depth].p_idx;
+			BUG_ON(ix != EXT_FIRST_INDEX(path[depth].p_hdr));
+		}
+		return 0;
+	}
+
+	BUG_ON(*logical < le32_to_cpu(ex->ee_block) + le16_to_cpu(ex->ee_len));
+
+	*logical = le32_to_cpu(ex->ee_block) + le16_to_cpu(ex->ee_len) - 1;
+	*phys = ext_pblock(ex) + le16_to_cpu(ex->ee_len) - 1;
+	return 0;
+}
+
+/*
+ * search the closest allocated block to the right for *logical
+ * and returns it at @logical + it's physical address at @phys
+ * if *logical is the smallest allocated block, the function
+ * returns 0 at @phys
+ * return value contains 0 (success) or error code
+ */
+int
+ext4_ext_search_right(struct inode *inode, struct ext4_ext_path *path,
+			ext4_fsblk_t *logical, ext4_fsblk_t *phys)
+{
+	struct buffer_head *bh = NULL;
+	struct ext4_extent_header *eh;
+	struct ext4_extent_idx *ix;
+	struct ext4_extent *ex;
+	ext4_fsblk_t block;
+	int depth;
+
+	BUG_ON(path == NULL);
+	depth = path->p_depth;
+	*phys = 0;
+
+	if (depth == 0 && path->p_ext == NULL)
+		return 0;
+
+	/* usually extent in the path covers blocks smaller
+	 * then *logical, but it can be that extent is the
+	 * first one in the file */
+
+	ex = path[depth].p_ext;
+	if (*logical < le32_to_cpu(ex->ee_block)) {
+		BUG_ON(EXT_FIRST_EXTENT(path[depth].p_hdr) != ex);
+		while (--depth >= 0) {
+			ix = path[depth].p_idx;
+			BUG_ON(ix != EXT_FIRST_INDEX(path[depth].p_hdr));
+		}
+		*logical = le32_to_cpu(ex->ee_block);
+		*phys = ext_pblock(ex);
+		return 0;
+	}
+
+	BUG_ON(*logical < le32_to_cpu(ex->ee_block) + le16_to_cpu(ex->ee_len));
+
+	if (ex != EXT_LAST_EXTENT(path[depth].p_hdr)) {
+		/* next allocated block in this leaf */
+		ex++;
+		*logical = le32_to_cpu(ex->ee_block);
+		*phys = ext_pblock(ex);
+		return 0;
+	}
+
+	/* go up and search for index to the right */
+	while (--depth >= 0) {
+		ix = path[depth].p_idx;
+		if (ix != EXT_LAST_INDEX(path[depth].p_hdr))
+			break;
+	}
+
+	if (depth < 0) {
+		/* we've gone up to the root and
+		 * found no index to the right */
+		return 0;
+	}
+
+	/* we've found index to the right, let's
+	 * follow it and find the closest allocated
+	 * block to the right */
+	ix++;
+	block = idx_pblock(ix);
+	while (++depth < path->p_depth) {
+		bh = sb_bread(inode->i_sb, block);
+		if (bh == NULL)
+			return -EIO;
+		eh = ext_block_hdr(bh);
+		if (ext4_ext_check_header(inode, eh, depth)) {
+			brelse(bh);
+			return -EIO;
+		}
+		ix = EXT_FIRST_INDEX(eh);
+		block = idx_pblock(ix);
+		brelse(bh);
+	}
+
+	bh = sb_bread(inode->i_sb, block);
+	if (bh == NULL)
+		return -EIO;
+	eh = ext_block_hdr(bh);
+	if (ext4_ext_check_header(inode, eh, depth)) {
+		brelse(bh);
+		return -EIO;
+	}
+	ex = EXT_FIRST_EXTENT(eh);
+	*logical = le32_to_cpu(ex->ee_block);
+	*phys = ext_pblock(ex);
+	brelse(bh);
+	return 0;
+
+}
+
+/*
  * ext4_ext_next_allocated_block:
  * returns allocated block in subsequent extent or EXT_MAX_BLOCK.
  * NOTE: it considers block number from index entry as
diff --git a/include/linux/ext4_fs_extents.h b/include/linux/ext4_fs_extents.h
index 81406f3..5ed0891 100644
--- a/include/linux/ext4_fs_extents.h
+++ b/include/linux/ext4_fs_extents.h
@@ -236,5 +236,7 @@ extern int ext4_ext_insert_extent(handle_t *, struct inode *, struct ext4_ext_pa
 extern int ext4_ext_walk_space(struct inode *, unsigned long, unsigned long, ext_prepare_callback, void *);
 extern struct ext4_ext_path * ext4_ext_find_extent(struct inode *, int, struct ext4_ext_path *);
 
+extern int ext4_ext_search_left(struct inode *, struct ext4_ext_path *, ext4_fsblk_t *, ext4_fsblk_t *);
+extern int ext4_ext_search_right(struct inode *, struct ext4_ext_path *, ext4_fsblk_t *, ext4_fsblk_t *);
 #endif /* _LINUX_EXT4_EXTENTS */
 
-- 
1.5.3.rc4.67.gf9286-dirty

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/4] This is the equivalent of ext3-mballoc3-sles10.patch
       [not found]   ` <1187000553923-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
@ 2007-08-13 10:22     ` Aneesh Kumar K.V
  2007-08-13 10:22       ` [PATCH 4/4] Fixes to make it build and run Aneesh Kumar K.V
  0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2007-08-13 10:22 UTC (permalink / raw)
  To: alex; +Cc: linux-ext4, Aneesh Kumar K.V

---
 fs/ext4/Makefile          |    2 +-
 fs/ext4/balloc.c          |   58 ++++++++++++++++++++++++++++++++++-----------
 fs/ext4/extents.c         |   44 +++++++++++++++++++++++++++-------
 fs/ext4/inode.c           |   14 +++++-----
 fs/ext4/super.c           |   18 ++++++++++++++
 fs/ext4/xattr.c           |    4 +-
 include/linux/ext4_fs.h   |    1 +
 include/linux/ext4_fs_i.h |    4 +++
 8 files changed, 112 insertions(+), 33 deletions(-)

diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index ae6e7e5..c7801ab 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -6,7 +6,7 @@ obj-$(CONFIG_EXT4DEV_FS) += ext4dev.o
 
 ext4dev-y	:= balloc.o bitmap.o dir.o file.o fsync.o ialloc.o inode.o \
 		   ioctl.o namei.o super.o symlink.o hash.o resize.o extents.o \
-		   ext4_jbd2.o
+		   ext4_jbd2.o mballoc.o
 
 ext4dev-$(CONFIG_EXT4DEV_FS_XATTR)	+= xattr.o xattr_user.o xattr_trusted.o
 ext4dev-$(CONFIG_EXT4DEV_FS_POSIX_ACL)	+= acl.o
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index e53b4af..55f4be8 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -110,7 +110,7 @@ struct ext4_group_desc * ext4_get_group_desc(struct super_block * sb,
  *
  * Return buffer_head on success or NULL in case of failure.
  */
-static struct buffer_head *
+struct buffer_head *
 read_block_bitmap(struct super_block *sb, unsigned int block_group)
 {
 	struct ext4_group_desc * desc;
@@ -412,6 +412,8 @@ void ext4_discard_reservation(struct inode *inode)
 	struct ext4_reserve_window_node *rsv;
 	spinlock_t *rsv_lock = &EXT4_SB(inode->i_sb)->s_rsv_window_lock;
 
+	ext4_mb_discard_inode_preallocations(inode);
+
 	if (!block_i)
 		return;
 
@@ -617,21 +619,29 @@ error_return:
  * @inode:		inode
  * @block:		start physical block to free
  * @count:		number of blocks to count
+ * @metadata: 		Are these metadata blocks
  */
 void ext4_free_blocks(handle_t *handle, struct inode *inode,
-			ext4_fsblk_t block, unsigned long count)
+			ext4_fsblk_t block, unsigned long count,
+			int metadata)
 {
 	struct super_block * sb;
-	unsigned long dquot_freed_blocks;
+	int freed;
+
+	/* this isn't the right place to decide whether block is metadata
+	 * inode.c/extents.c knows better, but for safety ... */
+	if (S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode) ||
+			ext4_should_journal_data(inode))
+		metadata = 1;
 
 	sb = inode->i_sb;
-	if (!sb) {
-		printk ("ext4_free_blocks: nonexistent device");
-		return;
-	}
-	ext4_free_blocks_sb(handle, sb, block, count, &dquot_freed_blocks);
-	if (dquot_freed_blocks)
-		DQUOT_FREE_BLOCK(inode, dquot_freed_blocks);
+
+	if (!test_opt(sb, MBALLOC) || !EXT4_SB(sb)->s_group_info)
+		ext4_free_blocks_sb(handle, sb, block, count, &freed);
+	else
+		ext4_mb_free_blocks(handle, inode, block, count, metadata, &freed);
+	if (freed)
+		DQUOT_FREE_BLOCK(inode, freed);
 	return;
 }
 
@@ -1420,7 +1430,7 @@ int ext4_should_retry_alloc(struct super_block *sb, int *retries)
  * any specific goal block.
  *
  */
-ext4_fsblk_t ext4_new_blocks(handle_t *handle, struct inode *inode,
+ext4_fsblk_t ext4_new_blocks_old(handle_t *handle, struct inode *inode,
 			ext4_fsblk_t goal, unsigned long *count, int *errp)
 {
 	struct buffer_head *bitmap_bh = NULL;
@@ -1681,11 +1691,31 @@ out:
 }
 
 ext4_fsblk_t ext4_new_block(handle_t *handle, struct inode *inode,
-			ext4_fsblk_t goal, int *errp)
+		ext4_fsblk_t goal, int *errp)
 {
-	unsigned long count = 1;
+	struct ext4_allocation_request ar;
+	ext4_fsblk_t ret;
+
+	if (!test_opt(inode->i_sb, MBALLOC)) {
+		unsigned long count = 1;
+		ret = ext4_new_blocks_old(handle, inode, goal, &count, errp);
+		return ret;
+	}
+
+	ar.inode = inode;
+	ar.goal = goal;
+	ar.len = 1;
+	ar.logical = 0;
+	ar.lleft = 0;
+	ar.pleft = 0;
+	ar.lright = 0;
+	ar.pright = 0;
+	ar.flags = 0;
+	ret = ext4_mb_new_blocks(handle, &ar, errp);
+	return ret;
+}
+
 
-	return ext4_new_blocks(handle, inode, goal, &count, errp);
 }
 
 /**
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 3084e09..8d163d7 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -851,7 +851,7 @@ cleanup:
 		for (i = 0; i < depth; i++) {
 			if (!ablocks[i])
 				continue;
-			ext4_free_blocks(handle, inode, ablocks[i], 1);
+			ext4_free_blocks(handle, inode, ablocks[i], 1, 1);
 		}
 	}
 	kfree(ablocks);
@@ -1800,7 +1800,7 @@ int ext4_ext_rm_idx(handle_t *handle, struct inode *inode,
 	ext_debug("index is empty, remove it, free block %llu\n", leaf);
 	bh = sb_find_get_block(inode->i_sb, leaf);
 	ext4_forget(handle, 1, inode, bh, leaf);
-	ext4_free_blocks(handle, inode, leaf, 1);
+	ext4_free_blocks(handle, inode, leaf, 1, 1);
 	return err;
 }
 
@@ -1861,8 +1861,10 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
 {
 	struct buffer_head *bh;
 	unsigned short ee_len =  ext4_ext_get_actual_len(ex);
-	int i;
+	int i, metadata = 0;
 
+	if (S_ISDIR(tree->inode->i_mode) || S_ISLNK(tree->inode->i_mode))
+		metadata = 1;
 #ifdef EXTENTS_STATS
 	{
 		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
@@ -1890,7 +1892,7 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
 			bh = sb_find_get_block(inode->i_sb, start + i);
 			ext4_forget(handle, 0, inode, bh, start + i);
 		}
-		ext4_free_blocks(handle, inode, start, num);
+		ext4_free_blocks(handle, inode, start, num, metadata);
 	} else if (from == le32_to_cpu(ex->ee_block)
 		   && to <= le32_to_cpu(ex->ee_block) + ee_len - 1) {
 		printk("strange request: removal %lu-%lu from %u:%u\n",
@@ -2380,6 +2382,7 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 	ext4_fsblk_t goal, newblock;
 	int err = 0, depth, ret;
 	unsigned long allocated = 0;
+	struct ext4_allocation_request ar;
 
 	__clear_bit(BH_New, &bh_result->b_state);
 	ext_debug("blocks %d/%lu requested for inode %u\n", (int) iblock,
@@ -2491,8 +2494,16 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 	if (S_ISREG(inode->i_mode) && (!EXT4_I(inode)->i_block_alloc_info))
 		ext4_init_block_alloc_info(inode);
 
-	/* allocate new block */
-	goal = ext4_ext_find_goal(inode, path, iblock);
+	/* find neighbour allocated blocks */
+	ar.lleft = iblock;
+	err = ext4_ext_search_left(&tree, path, &ar.lleft, &ar.pleft);
+	if (err)
+		goto out2;
+	ar.lright = iblock;
+	err = ext4_ext_search_right(&tree, path, &ar.lright, &ar.pright);
+	if (err)
+		goto out2;
+	/* FIXME!! allocated is updated with resepec to ar.pright */
 
 	/*
 	 * See if request is beyond maximum number of blocks we can have in
@@ -2507,6 +2518,7 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 		 create == EXT4_CREATE_UNINITIALIZED_EXT)
 		max_blocks = EXT_UNINIT_MAX_LEN;
 
+
 	/* Check if we can really insert (iblock)::(iblock+max_blocks) extent */
 	newex.ee_block = cpu_to_le32(iblock);
 	newex.ee_len = cpu_to_le16(max_blocks);
@@ -2515,7 +2527,14 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 		allocated = le16_to_cpu(newex.ee_len);
 	else
 		allocated = max_blocks;
-	newblock = ext4_new_blocks(handle, inode, goal, &allocated, &err);
+
+	/* allocate new block */
+	ar.inode = inode;
+	ar.goal = ext4_ext_find_goal(inode, path, iblock);
+	ar.logical = iblock;
+	ar.len = allocated;
+	ar.flags = EXT4_MB_HINT_DATA;
+	newblock = ext4_mb_new_blocks(handle, &ar, &err);
 	if (!newblock)
 		goto out2;
 	ext_debug("allocate new block: goal %llu, found %llu/%lu\n",
@@ -2523,14 +2542,17 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 
 	/* try to insert new extent into found leaf and return */
 	ext4_ext_store_pblock(&newex, newblock);
-	newex.ee_len = cpu_to_le16(allocated);
+	newex.ee_len = cpu_to_le16(ar.len);
 	if (create == EXT4_CREATE_UNINITIALIZED_EXT)  /* Mark uninitialized */
 		ext4_ext_mark_uninitialized(&newex);
 	err = ext4_ext_insert_extent(handle, inode, path, &newex);
 	if (err) {
 		/* free data blocks we just allocated */
+		/* not a good idea to call discard here directly,
+		 * but otherwise we'd need to call it every free() */
+		ext4_mb_discard_inode_preallocations(inode);
 		ext4_free_blocks(handle, inode, ext_pblock(&newex),
-					le16_to_cpu(newex.ee_len));
+					le16_to_cpu(newex.ee_len), 0);
 		goto out2;
 	}
 
@@ -2539,6 +2561,7 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 
 	/* previous routine could use block we allocated */
 	newblock = ext_pblock(&newex);
+	allocated = newex.ee_len;
 outnew:
 	__set_bit(BH_New, &bh_result->b_state);
 
@@ -2592,6 +2615,9 @@ void ext4_ext_truncate(struct inode * inode, struct page *page)
 	mutex_lock(&EXT4_I(inode)->truncate_mutex);
 	ext4_ext_invalidate_cache(inode);
 
+	/* it's important to discard preallocations under truncate_mutex */
+	ext4_mb_discard_inode_preallocations(inode);
+
 	/*
 	 * TODO: optimization is possible here.
 	 * Probably we need not scan at all,
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a4848e0..38b8d19 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -559,7 +559,7 @@ static int ext4_alloc_blocks(handle_t *handle, struct inode *inode,
 	return ret;
 failed_out:
 	for (i = 0; i <index; i++)
-		ext4_free_blocks(handle, inode, new_blocks[i], 1);
+		ext4_free_blocks(handle, inode, new_blocks[i], 1, 0);
 	return ret;
 }
 
@@ -658,9 +658,9 @@ failed:
 		ext4_journal_forget(handle, branch[i].bh);
 	}
 	for (i = 0; i <indirect_blks; i++)
-		ext4_free_blocks(handle, inode, new_blocks[i], 1);
+		ext4_free_blocks(handle, inode, new_blocks[i], 1, 0);
 
-	ext4_free_blocks(handle, inode, new_blocks[i], num);
+	ext4_free_blocks(handle, inode, new_blocks[i], num, 0);
 
 	return err;
 }
@@ -757,9 +757,9 @@ err_out:
 	for (i = 1; i <= num; i++) {
 		BUFFER_TRACE(where[i].bh, "call jbd2_journal_forget");
 		ext4_journal_forget(handle, where[i].bh);
-		ext4_free_blocks(handle,inode,le32_to_cpu(where[i-1].key),1);
+		ext4_free_blocks(handle,inode,le32_to_cpu(where[i-1].key),1, 0);
 	}
-	ext4_free_blocks(handle, inode, le32_to_cpu(where[num].key), blks);
+	ext4_free_blocks(handle, inode, le32_to_cpu(where[num].key), blks, 0);
 
 	return err;
 }
@@ -1990,7 +1990,7 @@ static void ext4_clear_blocks(handle_t *handle, struct inode *inode,
 		}
 	}
 
-	ext4_free_blocks(handle, inode, block_to_free, count);
+	ext4_free_blocks(handle, inode, block_to_free, count, 0);
 }
 
 /**
@@ -2163,7 +2163,7 @@ static void ext4_free_branches(handle_t *handle, struct inode *inode,
 				ext4_journal_test_restart(handle, inode);
 			}
 
-			ext4_free_blocks(handle, inode, nr, 1);
+			ext4_free_blocks(handle, inode, nr, 1, 1);
 
 			if (parent_bh) {
 				/*
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index edc2ef7..b95a1b2 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -739,6 +739,7 @@ enum {
 	Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_quota, Opt_noquota,
 	Opt_ignore, Opt_barrier, Opt_err, Opt_resize, Opt_usrquota,
 	Opt_grpquota, Opt_extents, Opt_noextents,
+	Opt_mballoc, Opt_nomballoc, Opt_stripe,
 };
 
 static match_table_t tokens = {
@@ -790,6 +791,9 @@ static match_table_t tokens = {
 	{Opt_barrier, "barrier=%u"},
 	{Opt_extents, "extents"},
 	{Opt_noextents, "noextents"},
+	{Opt_mballoc, "mballoc"},
+	{Opt_nomballoc, "nomballoc"},
+	{Opt_stripe, "stripe=%u"},
 	{Opt_err, NULL},
 	{Opt_resize, "resize"},
 };
@@ -1128,6 +1132,19 @@ clear_qf_name:
 		case Opt_noextents:
 			clear_opt (sbi->s_mount_opt, EXTENTS);
 			break;
+		case Opt_mballoc:
+			set_opt(sbi->s_mount_opt, MBALLOC);
+			break;
+		case Opt_nomballoc:
+			clear_opt(sbi->s_mount_opt, MBALLOC);
+			break;
+		case Opt_stripe:
+			if (match_int(&args[0], &option))
+				return 0;
+			if (option < 0)
+				return 0;
+			sbi->s_stripe = option;
+			break;
 		default:
 			printk (KERN_ERR
 				"EXT4-fs: Unrecognized mount option \"%s\" "
@@ -1926,6 +1943,7 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent)
 		"writeback");
 
 	ext4_ext_init(sb);
+	ext4_mb_init(sb, needs_recovery);
 
 	lock_kernel();
 	return 0;
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index b10d68f..4149b8a 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -480,7 +480,7 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
 		ea_bdebug(bh, "refcount now=0; freeing");
 		if (ce)
 			mb_cache_entry_free(ce);
-		ext4_free_blocks(handle, inode, bh->b_blocknr, 1);
+		ext4_free_blocks(handle, inode, bh->b_blocknr, 1, 1);
 		get_bh(bh);
 		ext4_forget(handle, 1, inode, bh, bh->b_blocknr);
 	} else {
@@ -822,7 +822,7 @@ inserted:
 			new_bh = sb_getblk(sb, block);
 			if (!new_bh) {
 getblk_failed:
-				ext4_free_blocks(handle, inode, block, 1);
+				ext4_free_blocks(handle, inode, block, 1, 1);
 				error = -EIO;
 				goto cleanup;
 			}
diff --git a/include/linux/ext4_fs.h b/include/linux/ext4_fs.h
index fbbb920..c2e819f 100644
--- a/include/linux/ext4_fs.h
+++ b/include/linux/ext4_fs.h
@@ -507,6 +507,7 @@ do {									       \
 #define EXT4_MOUNT_USRQUOTA		0x100000 /* "old" user quota */
 #define EXT4_MOUNT_GRPQUOTA		0x200000 /* "old" group quota */
 #define EXT4_MOUNT_EXTENTS		0x400000 /* Extents support */
+#define EXT4_MOUNT_MBALLOC		0x800000 /* Buddy allocation support */
 
 /* Compatibility, for having both ext2_fs.h and ext4_fs.h included at once */
 #ifndef _LINUX_EXT2_FS_H
diff --git a/include/linux/ext4_fs_i.h b/include/linux/ext4_fs_i.h
index 1a511e9..22ba80e 100644
--- a/include/linux/ext4_fs_i.h
+++ b/include/linux/ext4_fs_i.h
@@ -158,6 +158,10 @@ struct ext4_inode_info {
 	 * struct timespec i_{a,c,m}time in the generic inode.
 	 */
 	struct timespec i_crtime;
+
+	/* mballoc */
+	struct list_head i_prealloc_list;
+	spinlock_t i_prealloc_lock;
 };
 
 #endif	/* _LINUX_EXT4_FS_I */
-- 
1.5.3.rc4.67.gf9286-dirty

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/4] Fixes to make it build and run
  2007-08-13 10:22     ` [PATCH 3/4] This is the equivalent of ext3-mballoc3-sles10.patch Aneesh Kumar K.V
@ 2007-08-13 10:22       ` Aneesh Kumar K.V
  0 siblings, 0 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2007-08-13 10:22 UTC (permalink / raw)
  To: alex; +Cc: linux-ext4, Aneesh Kumar K.V

---
 fs/ext4/balloc.c        |   37 +++++++++++++++---
 fs/ext4/extents.c       |    7 +--
 fs/ext4/mballoc.c       |   96 ++++++++++++++--------------------------------
 include/linux/ext4_fs.h |   19 +++++----
 4 files changed, 74 insertions(+), 85 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 55f4be8..012c721 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -626,7 +626,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 			int metadata)
 {
 	struct super_block * sb;
-	int freed;
+	unsigned long dquot_freed_blocks;
 
 	/* this isn't the right place to decide whether block is metadata
 	 * inode.c/extents.c knows better, but for safety ... */
@@ -637,11 +637,13 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 	sb = inode->i_sb;
 
 	if (!test_opt(sb, MBALLOC) || !EXT4_SB(sb)->s_group_info)
-		ext4_free_blocks_sb(handle, sb, block, count, &freed);
+		ext4_free_blocks_sb(handle, sb, block, count,
+						&dquot_freed_blocks);
 	else
-		ext4_mb_free_blocks(handle, inode, block, count, metadata, &freed);
-	if (freed)
-		DQUOT_FREE_BLOCK(inode, freed);
+		ext4_mb_free_blocks(handle, inode, block, count,
+						metadata, &dquot_freed_blocks);
+	if (dquot_freed_blocks)
+		DQUOT_FREE_BLOCK(inode, dquot_freed_blocks);
 	return;
 }
 
@@ -1417,7 +1419,7 @@ int ext4_should_retry_alloc(struct super_block *sb, int *retries)
 }
 
 /**
- * ext4_new_blocks() -- core block(s) allocation function
+ * ext4_new_blocks_old() -- core block(s) allocation function
  * @handle:		handle to this transaction
  * @inode:		file inode
  * @goal:		given target block(filesystem wide)
@@ -1715,9 +1717,32 @@ ext4_fsblk_t ext4_new_block(handle_t *handle, struct inode *inode,
 	return ret;
 }
 
+ext4_fsblk_t ext4_new_blocks(handle_t *handle, struct inode *inode,
+		ext4_fsblk_t goal, unsigned long *count, int *errp)
+{
+	struct ext4_allocation_request ar;
+	ext4_fsblk_t ret;
 
+	if (!test_opt(inode->i_sb, MBALLOC)) {
+		ret = ext4_new_blocks_old(handle, inode, goal, count, errp);
+		return ret;
+	}
+
+	ar.inode = inode;
+	ar.goal = goal;
+	ar.len = *count;
+	ar.logical = 0;
+	ar.lleft = 0;
+	ar.pleft = 0;
+	ar.lright = 0;
+	ar.pright = 0;
+	ar.flags = 0;
+	ret = ext4_mb_new_blocks(handle, &ar, errp);
+	*count = ar.len;
+	return ret;
 }
 
+
 /**
  * ext4_count_free_blocks() -- count filesystem free blocks
  * @sb:		superblock
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 8d163d7..392286f 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1863,7 +1863,7 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
 	unsigned short ee_len =  ext4_ext_get_actual_len(ex);
 	int i, metadata = 0;
 
-	if (S_ISDIR(tree->inode->i_mode) || S_ISLNK(tree->inode->i_mode))
+	if (S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode))
 		metadata = 1;
 #ifdef EXTENTS_STATS
 	{
@@ -2496,14 +2496,13 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 
 	/* find neighbour allocated blocks */
 	ar.lleft = iblock;
-	err = ext4_ext_search_left(&tree, path, &ar.lleft, &ar.pleft);
+	err = ext4_ext_search_left(inode, path, &ar.lleft, &ar.pleft);
 	if (err)
 		goto out2;
 	ar.lright = iblock;
-	err = ext4_ext_search_right(&tree, path, &ar.lright, &ar.pright);
+	err = ext4_ext_search_right(inode, path, &ar.lright, &ar.pright);
 	if (err)
 		goto out2;
-	/* FIXME!! allocated is updated with resepec to ar.pright */
 
 	/*
 	 * See if request is beyond maximum number of blocks we can have in
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5fe758e..a28ba0c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -24,7 +24,7 @@
 #include <linux/time.h>
 #include <linux/fs.h>
 #include <linux/namei.h>
-#include <linux/ext4_jbd.h>
+#include <linux/ext4_jbd2.h>
 #include <linux/jbd.h>
 #include <linux/ext4_fs.h>
 #include <linux/quotaops.h>
@@ -303,7 +303,7 @@
  */
 #define MB_DEFAULT_STRIPE		256
 
-static kmem_cache_t *ext4_pspace_cachep = NULL;
+static struct kmem_cache *ext4_pspace_cachep = NULL;
 
 #ifdef EXT4_BB_MAX_BLOCKS
 #undef EXT4_BB_MAX_BLOCKS
@@ -361,7 +361,7 @@ struct ext4_prealloc_space {
 
 struct ext4_free_extent {
 	unsigned long fe_logical;
-	unsigned long fe_start;
+	ext4_grpblk_t fe_start;
 	unsigned long fe_group;
 	unsigned long fe_len;
 };
@@ -458,8 +458,8 @@ static struct proc_dir_entry *proc_root_ext4;
 
 int ext4_create (struct inode *, struct dentry *, int, struct nameidata *);
 struct buffer_head * read_block_bitmap(struct super_block *, unsigned int);
-unsigned long ext4_new_blocks_old(handle_t *handle, struct inode *inode,
-			unsigned long goal, unsigned long *count, int *errp);
+ext4_fsblk_t ext4_new_blocks_old(handle_t *handle, struct inode *inode,
+			ext4_fsblk_t goal, unsigned long *count, int *errp);
 void ext4_mb_release_blocks(struct super_block *, int);
 void ext4_mb_poll_new_transaction(struct super_block *, handle_t *);
 void ext4_mb_free_committed_blocks(struct super_block *);
@@ -473,26 +473,6 @@ void ext4_mb_put_pa(struct ext4_allocation_context *, struct super_block *, stru
 int ext4_mb_init_per_dev_proc(struct super_block *sb);
 int ext4_mb_destroy_per_dev_proc(struct super_block *sb);
 
-/*
- * Calculate the block group number and offset, given a block number
- */
-static void ext4_get_group_no_and_offset(struct super_block *sb,
-					unsigned long blocknr,
-					unsigned long *blockgrpp,
-					unsigned long *offsetp)
-{
-	struct ext4_super_block *es = EXT4_SB(sb)->s_es;
-	unsigned long offset;
-
-	blocknr = blocknr - le32_to_cpu(es->s_first_data_block);
-	offset = blocknr % EXT4_BLOCKS_PER_GROUP(sb);
-	blocknr = blocknr / EXT4_BLOCKS_PER_GROUP(sb);
-	if (offsetp)
-		*offsetp = offset;
-	if (blockgrpp)
-	        *blockgrpp = blocknr;
-
-}
 
 static inline void
 ext4_lock_group(struct super_block *sb, int group)
@@ -1958,13 +1938,13 @@ static int ext4_mb_seq_history_show(struct seq_file *seq, void *v)
 	if (hs->op == EXT4_MB_HISTORY_ALLOC) {
 		fmt = "%-5u %-8u %-23s %-23s %-23s %-5u %-5u %-2u "
 			"%-5u %-5s %-5u %-6u\n";
-		sprintf(buf2, "%lu/%lu/%lu@%lu", hs->result.fe_group,
+		sprintf(buf2, "%lu/%d/%lu@%lu", hs->result.fe_group,
 			hs->result.fe_start, hs->result.fe_len,
 			hs->result.fe_logical);
-		sprintf(buf, "%lu/%lu/%lu@%lu", hs->orig.fe_group,
+		sprintf(buf, "%lu/%d/%lu@%lu", hs->orig.fe_group,
 			hs->orig.fe_start, hs->orig.fe_len,
 			hs->orig.fe_logical);
-		sprintf(buf3, "%lu/%lu/%lu@%lu", hs->goal.fe_group,
+		sprintf(buf3, "%lu/%d/%lu@%lu", hs->goal.fe_group,
 			hs->goal.fe_start, hs->goal.fe_len,
 			hs->goal.fe_logical);
 		seq_printf(seq, fmt, hs->pid, hs->ino, buf, buf3, buf2,
@@ -1973,20 +1953,20 @@ static int ext4_mb_seq_history_show(struct seq_file *seq, void *v)
 				hs->buddy ? 1 << hs->buddy : 0);
 	} else if (hs->op == EXT4_MB_HISTORY_PREALLOC) {
 		fmt = "%-5u %-8u %-23s %-23s %-23s\n";
-		sprintf(buf2, "%lu/%lu/%lu@%lu", hs->result.fe_group,
+		sprintf(buf2, "%lu/%d/%lu@%lu", hs->result.fe_group,
 			hs->result.fe_start, hs->result.fe_len,
 			hs->result.fe_logical);
-		sprintf(buf, "%lu/%lu/%lu@%lu", hs->orig.fe_group,
+		sprintf(buf, "%lu/%d/%lu@%lu", hs->orig.fe_group,
 			hs->orig.fe_start, hs->orig.fe_len,
 			hs->orig.fe_logical);
 		seq_printf(seq, fmt, hs->pid, hs->ino, buf, "", buf2);
 	} else if (hs->op == EXT4_MB_HISTORY_DISCARD) {
-		sprintf(buf2, "%lu/%lu/%lu", hs->result.fe_group,
+		sprintf(buf2, "%lu/%d/%lu", hs->result.fe_group,
 			hs->result.fe_start, hs->result.fe_len);
 		seq_printf(seq, "%-5u %-8u %-23s discard\n",
 				hs->pid, hs->ino, buf2);
 	} else if (hs->op == EXT4_MB_HISTORY_FREE) {
-		sprintf(buf2, "%lu/%lu/%lu", hs->result.fe_group,
+		sprintf(buf2, "%lu/%d/%lu", hs->result.fe_group,
 			hs->result.fe_start, hs->result.fe_len);
 		seq_printf(seq, "%-5u %-8u %-23s free\n",
 				hs->pid, hs->ino, buf2);
@@ -2809,7 +2789,7 @@ int __init init_ext4_proc(void)
 	ext4_pspace_cachep =
 		kmem_cache_create("ext4_prealloc_space",
 				     sizeof(struct ext4_prealloc_space),
-				     0, SLAB_RECLAIM_ACCOUNT, NULL, NULL);
+				     0, SLAB_RECLAIM_ACCOUNT, NULL);
 	if (ext4_pspace_cachep == NULL)
 		return -ENOMEM;
 
@@ -3242,7 +3222,7 @@ void ext4_mb_generate_from_pa(struct super_block *sb, void *bitmap, int group)
 	struct ext4_prealloc_space *pa;
 	struct list_head *cur;
 	unsigned long groupnr;
-	unsigned long start;
+	ext4_grpblk_t start;
 	int preallocated = 0, count = 0, len;
 
  	/* all form of preallocation discards first load group,
@@ -3348,7 +3328,7 @@ int ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 	BUG_ON(ac->ac_status != AC_STATUS_FOUND);
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
-	pa = kmem_cache_alloc(ext4_pspace_cachep, SLAB_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -3433,7 +3413,7 @@ int ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
 	BUG_ON(ext4_pspace_cachep == NULL);
-	pa = kmem_cache_alloc(ext4_pspace_cachep, SLAB_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -3500,7 +3480,8 @@ int ext4_mb_release_inode_pa(struct ext4_buddy *e3b,
 	struct ext4_allocation_context ac;
 	struct super_block *sb = e3b->bd_sb;
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
-	unsigned long bit, end, next, group;
+	unsigned long end, next, group;
+	ext4_grpblk_t bit;
 	sector_t start;
 	int err = 0, free = 0;
 
@@ -3554,7 +3535,8 @@ int ext4_mb_release_group_pa(struct ext4_buddy *e3b,
 {
 	struct ext4_allocation_context ac;
 	struct super_block *sb = e3b->bd_sb;
-	unsigned long bit, group;
+	unsigned long group;
+	ext4_grpblk_t bit;
 
 	ac.ac_op = EXT4_MB_HISTORY_DISCARD;
 
@@ -3873,7 +3855,7 @@ int ext4_mb_initialize_context(struct ext4_allocation_context *ac,
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	struct ext4_super_block *es = sbi->s_es;
 	unsigned long group, len, goal;
-	unsigned long block;
+	ext4_grpblk_t block;
 
 	/* we can't allocate > group size */
 	len = ar->len;
@@ -3988,12 +3970,15 @@ unsigned long ext4_mb_new_blocks(handle_t *handle,
 	sbi = EXT4_SB(sb);
 
 	if (!test_opt(sb, MBALLOC)) {
+#if 0
 		static int ext4_mballoc_warning = 0;
 		if (ext4_mballoc_warning++ == 0)
-			printk(KERN_ERR "EXT4-fs: multiblock request with "
+			printk(KERN_ERR "EXT3-fs: multiblock request with "
 					"mballoc disabled!\n");
 		ar->len = 1;
-		block = ext4_new_block_old(handle, ar->inode, ar->goal, errp);
+#endif
+		block = ext4_new_blocks_old(handle, ar->inode, ar->goal,
+								&(ar->len), errp);
 		return block;
 	}
 
@@ -4057,30 +4042,6 @@ out:
 }
 EXPORT_SYMBOL(ext4_mb_new_blocks);
 
-int ext4_new_block(handle_t *handle, struct inode *inode,
-		   unsigned long goal, int *errp)
-{
-	struct ext4_allocation_request ar;
-	unsigned long ret;
-
-	if (!test_opt(inode->i_sb, MBALLOC)) {
-		ret = ext4_new_block_old(handle, inode, goal, errp);
-		return ret;
-	}
-
-	ar.inode = inode;
-	ar.goal = goal;
-	ar.len = 1;
-	ar.logical = 0;
-	ar.lleft = 0;
-	ar.pleft = 0;
-	ar.lright = 0;
-	ar.pright = 0;
-	ar.flags = 0;
-	ret = ext4_mb_new_blocks(handle, &ar, errp);
-	return ret;
-}
-
 void ext4_mb_poll_new_transaction(struct super_block *sb, handle_t *handle)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -4177,14 +4138,15 @@ int ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e3b,
  */
 void ext4_mb_free_blocks(handle_t *handle, struct inode *inode,
 			unsigned long block, unsigned long count,
-			int metadata, int *freed)
+			int metadata, unsigned long *freed)
 {
 	struct buffer_head *bitmap_bh = NULL;
 	struct super_block *sb = inode->i_sb;
 	struct ext4_allocation_context ac;
 	struct ext4_group_desc *gdp;
 	struct ext4_super_block *es;
-	unsigned long bit, overflow;
+	unsigned long overflow;
+	ext4_grpblk_t bit;
 	struct buffer_head *gd_bh;
 	unsigned long block_group;
 	struct ext4_sb_info *sbi;
diff --git a/include/linux/ext4_fs.h b/include/linux/ext4_fs.h
index c2e819f..8796e6a 100644
--- a/include/linux/ext4_fs.h
+++ b/include/linux/ext4_fs.h
@@ -20,6 +20,8 @@
 #include <linux/blkdev.h>
 #include <linux/magic.h>
 
+#include <linux/ext4_fs_i.h>
+
 /*
  * The second extended filesystem constants/structures
  */
@@ -70,12 +72,12 @@
 
 struct ext4_allocation_request {
 	struct inode *inode;	/* target inode for block we're allocating */
-	unsigned long logical;	/* logical block in target inode */
-	unsigned long goal;	/* phys. target (a hint) */
-	unsigned long lleft;	/* the closest logical allocated block to the left */
-	unsigned long pleft;	/* phys. block for ^^^ */
-	unsigned long lright;	/* the closest logical allocated block to the right */
-	unsigned long pright;	/* phys. block for ^^^ */
+	ext4_fsblk_t logical;	/* logical block in target inode */
+	ext4_fsblk_t goal;	/* phys. target (a hint) */
+	ext4_fsblk_t lleft;	/* the closest logical allocated block to the left */
+	ext4_fsblk_t pleft;	/* phys. block for ^^^ */
+	ext4_fsblk_t lright;	/* the closest logical allocated block to the right */
+	ext4_fsblk_t pright;	/* phys. block for ^^^ */
 	unsigned long len;	/* how many blocks we want to allocate */
 	unsigned long flags;	/* flags. see above EXT4_MB_HINT_* */
 };
@@ -930,7 +932,7 @@ extern ext4_fsblk_t ext4_new_block (handle_t *handle, struct inode *inode,
 extern ext4_fsblk_t ext4_new_blocks (handle_t *handle, struct inode *inode,
 			ext4_fsblk_t goal, unsigned long *count, int *errp);
 extern void ext4_free_blocks (handle_t *handle, struct inode *inode,
-			ext4_fsblk_t block, unsigned long count);
+			ext4_fsblk_t block, unsigned long count, int metadata);
 extern void ext4_free_blocks_sb (handle_t *handle, struct super_block *sb,
 				 ext4_fsblk_t block, unsigned long count,
 				unsigned long *pdquot_freed_blocks);
@@ -980,7 +982,8 @@ extern void ext4_mb_release_blocks(struct super_block *, int);
 extern void ext4_mb_discard_inode_preallocations(struct inode *);
 extern int __init init_ext4_proc(void);
 extern void exit_ext4_proc(void);
-extern void ext4_mb_free_blocks(handle_t *, struct inode *, unsigned long, unsigned long, int, int *);
+extern void ext4_mb_free_blocks(handle_t *, struct inode *,
+		unsigned long, unsigned long, int, unsigned long *);
 
 
 /* inode.c */
-- 
1.5.3.rc4.67.gf9286-dirty

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC] mballoc patches
  2007-08-13 10:22 [RFC] mballoc patches Aneesh Kumar K.V
  2007-08-13 10:22 ` [PATCH 1/4] Add some new function for searching extent tree Aneesh Kumar K.V
@ 2007-08-13 16:38 ` Aneesh Kumar K.V
  2007-08-21 21:00   ` Eric Sandeen
  1 sibling, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2007-08-13 16:38 UTC (permalink / raw)
  To: linux-ext4

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]



Aneesh Kumar K.V wrote:
> Alex actually pointed me the new mballoc patches at
> ftp://ftp.clusterfs.com/pub/people/alex/mballoc3
> 
> The series is the forward port of the same on top of
> d4ac2477fad0f2680e84ec12e387ce67682c5c13 (v2.6.23-rc2)
> 
> 
> I guess the mballoc3 patch at clusterfs.com is based on
> a patched ext3(I guess it is ext3 + extent tree local to
> clusterfs ? ). The following series is based on ext4.
> 
> 
> I tested the changes with different blocks per group
> combination and it seems to be working fine. (The last
> series did panic with mke2fs -g 800 )
> 
> 
> I will now look at splitting this to smaller patches.
> 
> NOTE : I am not sure whether patch 2 will be able to make the list.
> if not is there a place i can ftp/scp them so that others can access
> the same ?
> 
>

I am attaching below the PATCH 2/4 in .gz format. The uncompressed one
got dropped by the list.

-aneesh

[-- Attachment #2: 0002-mballoc-core-patch.patch.gz --]
[-- Type: application/x-gzip, Size: 31201 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] mballoc patches
  2007-08-13 16:38 ` [RFC] mballoc patches Aneesh Kumar K.V
@ 2007-08-21 21:00   ` Eric Sandeen
  2007-09-10 12:29     ` Alex Tomas
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Sandeen @ 2007-08-21 21:00 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-ext4

Aneesh Kumar K.V wrote:
> 
> 
> I am attaching below the PATCH 2/4 in .gz format. The uncompressed one
> got dropped by the list.
> 
> -aneesh

hmm makes it hard to comment in-line though :)  So basing this on the
patch currently in the git repo.

+/*
+ * default stripe size = 1MB
+ */
+#define MB_DEFAULT_STRIPE              256

Units?  Doesn't seem to matter anyway as it's never referenced.

+       /* tunables */
+       unsigned long s_mb_factor;
+       unsigned long s_stripe;
+       unsigned long s_mb_small_req;
+       unsigned long s_mb_large_req;
+       unsigned long s_mb_max_to_scan;
+       unsigned long s_mb_min_to_scan;

could we get some comments here as to what these are, and what units?

Same is true many places... for example

+static int mb_find_extent(struct ext4_buddy *e3b, int order, int block,
+                               int needed, struct ext4_free_extent *ex)

how many "what" are needed?

And perhaps an addition of the new mount options to
Documentation/fs/ext4.txt would be good.

+#define EXT4_MB_BITMAP(e3b)    ((e3b)->bd_bitmap)
+#define EXT4_MB_BUDDY(e3b)     ((e3b)->bd_buddy)

For the sake of consistency should these (and others) be e4b?

Also there are a *lot* of BUGs and BUG_ONs added in this patch... are
none of these recoverable?

Thanks,

-Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] mballoc patches
  2007-08-21 21:00   ` Eric Sandeen
@ 2007-09-10 12:29     ` Alex Tomas
  2007-09-10 14:35       ` Eric Sandeen
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Tomas @ 2007-09-10 12:29 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Aneesh Kumar K.V, linux-ext4

Eric Sandeen wrote:
> +/*
> + * default stripe size = 1MB
> + */
> +#define MB_DEFAULT_STRIPE              256

agree, though seems we'd better make it blocksize-insensitive

> 
> Units?  Doesn't seem to matter anyway as it's never referenced.
> 
> +       /* tunables */
> +       unsigned long s_mb_factor;
> +       unsigned long s_stripe;
> +       unsigned long s_mb_small_req;
> +       unsigned long s_mb_large_req;
> +       unsigned long s_mb_max_to_scan;
> +       unsigned long s_mb_min_to_scan;
> 
> could we get some comments here as to what these are, and what units?

OK, I'll do as well as policy (and tunning) description.

> 
> Same is true many places... for example
> 
> +static int mb_find_extent(struct ext4_buddy *e3b, int order, int block,
> +                               int needed, struct ext4_free_extent *ex)
> 
> how many "what" are needed?

well, blocks :)

> And perhaps an addition of the new mount options to
> Documentation/fs/ext4.txt would be good.
> 
> +#define EXT4_MB_BITMAP(e3b)    ((e3b)->bd_bitmap)
> +#define EXT4_MB_BUDDY(e3b)     ((e3b)->bd_buddy)
> 
> For the sake of consistency should these (and others) be e4b?

OK

> 
> Also there are a *lot* of BUGs and BUG_ONs added in this patch... are
> none of these recoverable?

well, I'll review the code in this regard again, but most of them are not.
not that I like kernel panics, but BUG_ON() are very helpful to maintain
code, especially in long-term.

thanks, Alex

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] mballoc patches
  2007-09-10 12:29     ` Alex Tomas
@ 2007-09-10 14:35       ` Eric Sandeen
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Sandeen @ 2007-09-10 14:35 UTC (permalink / raw)
  To: Alex Tomas; +Cc: Aneesh Kumar K.V, linux-ext4

Alex Tomas wrote:
> Eric Sandeen wrote:
>> +/*
>> + * default stripe size = 1MB
>> + */
>> +#define MB_DEFAULT_STRIPE              256
> 
> agree, though seems we'd better make it blocksize-insensitive

Well, for now this isn't even used at *all* so might as well just remove it.

>> +static int mb_find_extent(struct ext4_buddy *e3b, int order, int block,
>> +                               int needed, struct ext4_free_extent *ex)
>>
>> how many "what" are needed?
> 
> well, blocks :)

I figured, though bytes is a possibility.  :)  In general, being
explicit about units helps; xfs suffers a bit from byte/block/sector
confusion in the code, in some places.

>> Also there are a *lot* of BUGs and BUG_ONs added in this patch... are
>> none of these recoverable?
> 
> well, I'll review the code in this regard again, but most of them are not.
> not that I like kernel panics, but BUG_ON() are very helpful to maintain
> code, especially in long-term.

Ok, I just think there might be upstream pushback on these, at one point
at least they were trying to reduce the nr. of BUG calls in the code...

Thanks,
-Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-09-10 14:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-08-13 10:22 [RFC] mballoc patches Aneesh Kumar K.V
2007-08-13 10:22 ` [PATCH 1/4] Add some new function for searching extent tree Aneesh Kumar K.V
     [not found]   ` <1187000553923-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
2007-08-13 10:22     ` [PATCH 3/4] This is the equivalent of ext3-mballoc3-sles10.patch Aneesh Kumar K.V
2007-08-13 10:22       ` [PATCH 4/4] Fixes to make it build and run Aneesh Kumar K.V
2007-08-13 16:38 ` [RFC] mballoc patches Aneesh Kumar K.V
2007-08-21 21:00   ` Eric Sandeen
2007-09-10 12:29     ` Alex Tomas
2007-09-10 14:35       ` Eric Sandeen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).