* Ext4 devel interlock meeting minutes (October 1, 2007) @ 2007-10-01 22:10 Avantika Mathur 2007-10-02 5:58 ` Andreas Dilger 2007-10-02 9:24 ` Valerie Clement 0 siblings, 2 replies; 7+ messages in thread From: Avantika Mathur @ 2007-10-01 22:10 UTC (permalink / raw) To: linux-ext4 Ext4 Developer Interlock Call: 10/1/07 Meeting Minutes Attendees: Mingming Cao, Aneesh Veetil, Dave Kleikamp, Akira Fujita, Ted Ts'o, Jose Santos, Avantika Mathur, Jean-Pierre Dion Minutes can be accessed at: http://ext4.wiki.kernel.org/index.php/Ext4_Developer%27s_Conference_Call EXT4 PATCH QUEUE Multiple Block Allocation: - Main outstanding issue is still need for better documentation of the code. - Aneesh has marked the places that need explanation with fixme, and Alex will be adding the needed comments. If this gets done quickly, we can try to push this feature to 2.6.24 - Aneesh has asked for recommendations for a benchmark to run which will test the mballoc allocator. Mingming suggested using dd and dbench, as was done in the past for basic mballoc patches. To test the in-core preallocation, kernel untar tests. Look at the file fragmentation after running the test. Large Block Patches: - Mingming will post these patches to lkml for review Uninitialized block groups: - the uninitialized block group patches were posted to lkml and are in the -mm tree. JBD-stats through procfs patches: - Ted will look at these patches and see if they are ready to push upstream generic-find-next-le-bit: - This function is only used in multiple block allocation patches. It would not be a good idea to post this before mballoc, since the function would be unused and may be cleaned up later. Delayed Allocation: - There has been an lkml thread about these patches. - We have and approach that works for ext4, implemented at the vfs level, but unless we can prove it can work for other filesystems, it will not be accepted. - Christoph Hellwig has commented that these patches will not work for XFS. - Mingming will try the patches with ext2/3, and Shaggy will try to find time to test with JFS. - If this approach works for other filesystems, we can try to push to mainline as a placeholder. truncate mutex patch: - There is a performance concern with these patches. Mingming suggested doing some testing to see what the actual performance regression is. - Ted suggests breaking into two patches: -- Patch 1 - Change the name of the mutex to block_mapped_mutex and update comments to reflect the usage of the mutex -- Patch 2 - Change to a read/write mutex GENERAL - Those patches we would still like to see in 2.6.24 need to be pushed very quickly - We should avoid having ext2/3 patches in the ext4-patch-queue and git tree. - Some of the patches that we just send to lkml for review end up being pulled into the -mm tree, then Ted pulls them to the ext4-git tree, and Andrew must remove from -mm. When we post for comments, we should use [PATCH,RFC] to avoid this. - Mingming suggested having a patch queue for the ext4-git tree for easier review. Ted is also considering having a pu branch for the git tree. Ted will think about what will work best. - In the ext4-patch qDescription of patch include which version of -mm the patch is in ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Ext4 devel interlock meeting minutes (October 1, 2007) 2007-10-01 22:10 Ext4 devel interlock meeting minutes (October 1, 2007) Avantika Mathur @ 2007-10-02 5:58 ` Andreas Dilger 2007-10-02 14:23 ` Eric Sandeen 2007-10-02 9:24 ` Valerie Clement 1 sibling, 1 reply; 7+ messages in thread From: Andreas Dilger @ 2007-10-02 5:58 UTC (permalink / raw) To: Avantika Mathur; +Cc: linux-ext4 On Oct 01, 2007 15:10 -0700, Avantika Mathur wrote: > Delayed Allocation: > - There has been an lkml thread about these patches. > - We have and approach that works for ext4, implemented at the vfs > level, but unless we can prove it can work for other filesystems, it > will not be accepted. > - Christoph Hellwig has commented that these patches will not work for > XFS. Hmm, but I thought that Christoph also agreed that it would be OK to get the ext4 delalloc code merged separately, so long as it doesn't need big/any changes to the VFS to implement it. It might be that the ext4 and XFS code is different enough that they cannot share the delayed allocation code. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Ext4 devel interlock meeting minutes (October 1, 2007) 2007-10-02 5:58 ` Andreas Dilger @ 2007-10-02 14:23 ` Eric Sandeen 0 siblings, 0 replies; 7+ messages in thread From: Eric Sandeen @ 2007-10-02 14:23 UTC (permalink / raw) To: Andreas Dilger; +Cc: Avantika Mathur, linux-ext4 Andreas Dilger wrote: > On Oct 01, 2007 15:10 -0700, Avantika Mathur wrote: >> Delayed Allocation: >> - There has been an lkml thread about these patches. >> - We have and approach that works for ext4, implemented at the vfs >> level, but unless we can prove it can work for other filesystems, it >> will not be accepted. >> - Christoph Hellwig has commented that these patches will not work for >> XFS. > > Hmm, but I thought that Christoph also agreed that it would be OK to > get the ext4 delalloc code merged separately, so long as it doesn't > need big/any changes to the VFS to implement it. It might be that the > ext4 and XFS code is different enough that they cannot share the > delayed allocation code. IIRC there is only one change to the VFS, a test for buffer_delay() in __block_write_full_page(). XFS is the only current caller of set_buffer_delay, and XFS does not even use the block_write_full_page path. The rest was a huge chunk dropped into mpage.c, but not interfering with anything else. So, I don't see any conflict. -Eric ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Ext4 devel interlock meeting minutes (October 1, 2007) 2007-10-01 22:10 Ext4 devel interlock meeting minutes (October 1, 2007) Avantika Mathur 2007-10-02 5:58 ` Andreas Dilger @ 2007-10-02 9:24 ` Valerie Clement 2007-10-02 11:36 ` Aneesh Kumar K.V 2007-10-02 14:12 ` Andreas Dilger 1 sibling, 2 replies; 7+ messages in thread From: Valerie Clement @ 2007-10-02 9:24 UTC (permalink / raw) To: Avantika Mathur; +Cc: linux-ext4 Avantika Mathur wrote: > Multiple Block Allocation: > - Main outstanding issue is still need for better documentation of the > code. > - Aneesh has marked the places that need explanation with fixme, and > Alex will be adding the needed comments. If this gets done quickly, we > can try to push this feature to 2.6.24 > - Aneesh has asked for recommendations for a benchmark to run which will > test the mballoc allocator. Mingming suggested using dd and dbench, as > was done in the past for basic mballoc patches. To test the in-core > preallocation, kernel untar tests. Look at the file fragmentation after > running the test. Currently, the mballoc feature is not compatible with the uninit_groups feature. I have just tried a simple test which failed. Isn't this a problem? Another point that is not clear for me, is mballoc compatible with delalloc now, or is there still a restriction? Valérie ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Ext4 devel interlock meeting minutes (October 1, 2007) 2007-10-02 9:24 ` Valerie Clement @ 2007-10-02 11:36 ` Aneesh Kumar K.V 2007-10-02 14:12 ` Andreas Dilger 1 sibling, 0 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2007-10-02 11:36 UTC (permalink / raw) To: Valerie Clement; +Cc: Avantika Mathur, linux-ext4 Valerie Clement wrote: > Avantika Mathur wrote: >> Multiple Block Allocation: >> - Main outstanding issue is still need for better documentation of the >> code. >> - Aneesh has marked the places that need explanation with fixme, and >> Alex will be adding the needed comments. If this gets done quickly, >> we can try to push this feature to 2.6.24 >> - Aneesh has asked for recommendations for a benchmark to run which >> will test the mballoc allocator. Mingming suggested using dd and >> dbench, as was done in the past for basic mballoc patches. To test >> the in-core preallocation, kernel untar tests. Look at the file >> fragmentation after running the test. > > Currently, the mballoc feature is not compatible with the uninit_groups > feature. I have just tried a simple test which failed. Isn't this a > problem? > > Another point that is not clear for me, is mballoc compatible with > delalloc now, or is there still a restriction? > I thought i had merged the patch from Andreas. http://marc.info/?l=linux-ext4&m=118833298930860&w=2 But i don't see it in mballoc-core.patch Will update the mballoc patch -aneesh ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Ext4 devel interlock meeting minutes (October 1, 2007) 2007-10-02 9:24 ` Valerie Clement 2007-10-02 11:36 ` Aneesh Kumar K.V @ 2007-10-02 14:12 ` Andreas Dilger 2007-10-02 18:10 ` Aneesh Kumar K.V 1 sibling, 1 reply; 7+ messages in thread From: Andreas Dilger @ 2007-10-02 14:12 UTC (permalink / raw) To: Valerie Clement; +Cc: Avantika Mathur, linux-ext4 On Oct 02, 2007 11:24 +0200, Valerie Clement wrote: > Currently, the mballoc feature is not compatible with the uninit_groups > feature. I have just tried a simple test which failed. Isn't this a problem? I thought Avantika posted the incremental patch to ext4 mballoc to work with the uninit groups patch? This is what we have in our ext3 patch: Index: linux-rhel5/fs/ext3/mballoc.c =================================================================== --- linux-rhel5.orig/fs/ext3/mballoc.c 2007-07-18 17:32:04.000000000 +0200 +++ linux-rhel5/fs/ext3/mballoc.c 2007-07-18 17:32:15.000000000 +0200 @@ -36,6 +36,8 @@ #include <linux/seq_file.h> #include <linux/version.h> +#include "group.h" + /* * MUSTDO: * - test ext3_ext_search_left() and ext3_ext_search_right() @@ -323,6 +325,7 @@ struct ext3_group_info { unsigned long bb_state; unsigned long bb_tid; struct ext3_free_metadata *bb_md_cur; + struct ext3_group_desc *bb_gdp; unsigned short bb_first_free; unsigned short bb_free; unsigned short bb_fragments; @@ -943,10 +946,7 @@ static int ext3_mb_init_cache(struct pag if (first_group + i >= EXT3_SB(sb)->s_groups_count) break; - err = -EIO; - desc = ext3_get_group_desc(sb, first_group + i, NULL); - if (desc == NULL) - goto out; + desc = EXT3_GROUP_INFO(sb, first_group + i)->bb_gdp; err = -ENOMEM; bh[i] = sb_getblk(sb, le32_to_cpu(desc->bg_block_bitmap)); @@ -961,7 +961,12 @@ static int ext3_mb_init_cache(struct pag unlock_buffer(bh[i]); continue; } - + if (desc->bg_flags & cpu_to_le16(EXT3_BG_BLOCK_UNINIT)) { + ext3_init_block_bitmap(sb, bh[i], first_group + i,desc); + set_buffer_uptodate(bh[i]); + unlock_buffer(bh[i]); + continue; + } get_bh(bh[i]); bh[i]->b_end_io = end_buffer_read_sync; submit_bh(READ, bh[i]); @@ -1732,6 +1737,10 @@ static int ext3_mb_good_group(struct ext switch (cr) { case 0: BUG_ON(ac->ac_2order == 0); + /* If this group is uninitialized, skip it initially */ + if (grp->bb_gdp->bg_flags & + cpu_to_le16(EXT3_BG_BLOCK_UNINIT)) + return 0; bits = ac->ac_sb->s_blocksize_bits + 1; for (i = ac->ac_2order; i <= bits; i++) if (grp->bb_counters[i] > 0) @@ -1825,7 +1834,9 @@ repeat: } ac->ac_groups_scanned++; - if (cr == 0) + if (cr == 0 || (e3b.bd_info->bb_gdp->bg_flags & + cpu_to_le16(EXT3_BG_BLOCK_UNINIT) && + ac->ac_2order != 0)) ext3_mb_simple_scan_group(ac, &e3b); else if (cr == 1 && ac->ac_g_ex.fe_len == sbi->s_stripe) ext3_mb_scan_aligned(ac, &e3b); @@ -2304,12 +2315,13 @@ int ext3_mb_init_backend(struct super_bl i--; goto err_freebuddy; } + memset(meta_group_info[j], 0, len); desc = ext3_get_group_desc(sb, i, NULL); + meta_group_info[j]->bb_gdp = desc; if (desc == NULL) { printk(KERN_ERR"EXT3-fs: can't read descriptor %u\n",i); goto err_freebuddy; } - memset(meta_group_info[j], 0, len); set_bit(EXT3_GROUP_INFO_NEED_INIT_BIT, &meta_group_info[j]->bb_state); @@ -2958,9 +2970,17 @@ int ext3_mb_mark_diskspace_used(struct e mb_set_bits(bitmap_bh->b_data, ac->ac_b_ex.fe_start, ac->ac_b_ex.fe_len); spin_lock(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group)); + if (gdp->bg_flags & cpu_to_le16(EXT3_BG_BLOCK_UNINIT)) { + gdp->bg_flags &= cpu_to_le16(~EXT3_BG_BLOCK_UNINIT); + gdp->bg_free_blocks_count = + cpu_to_le16(ext3_free_blocks_after_init(sb, + ac->ac_b_ex.fe_group, + gdp)); + } gdp->bg_free_blocks_count = cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) - ac->ac_b_ex.fe_len); + gdp->bg_checksum = ext3_group_desc_csum(sbi, ac->ac_b_ex.fe_group, gdp); spin_unlock(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group)); percpu_counter_mod(&sbi->s_freeblocks_counter, - ac->ac_b_ex.fe_len); @@ -4346,6 +4366,7 @@ do_more: spin_lock(sb_bgl_lock(sbi, block_group)); gdp->bg_free_blocks_count = cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) + count); + gdp->bg_checksum = ext3_group_desc_csum(sbi, block_group, gdp); spin_unlock(sb_bgl_lock(sbi, block_group)); percpu_counter_mod(&sbi->s_freeblocks_counter, count); Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Ext4 devel interlock meeting minutes (October 1, 2007) 2007-10-02 14:12 ` Andreas Dilger @ 2007-10-02 18:10 ` Aneesh Kumar K.V 0 siblings, 0 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2007-10-02 18:10 UTC (permalink / raw) To: Andreas Dilger; +Cc: Valerie Clement, Avantika Mathur, linux-ext4 Andreas Dilger wrote: > On Oct 02, 2007 11:24 +0200, Valerie Clement wrote: >> Currently, the mballoc feature is not compatible with the uninit_groups >> feature. I have just tried a simple test which failed. Isn't this a problem? > > I thought Avantika posted the incremental patch to ext4 mballoc to work > with the uninit groups patch? This is what we have in our ext3 patch: > > > This is what i have right now. The balloc.c and group.h changes should actually go in uninitialized block group patch. This patch is slightly different from the one posted by Andreas. Here instead of adding ext4_block_group descriptor again to ext4_group_info i am explicitly calling ext4_get_group_desc. -aneesh diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c index 321ad1e..a8aebf2 100644 --- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -106,6 +106,16 @@ unsigned ext4_init_block_bitmap(struct super_block *sb, struct buffer_head *bh, for (bit = (ext4_inode_table(sb, gdp) - start), bit_max = bit + sbi->s_itb_per_group; bit < bit_max; bit++) ext4_set_bit(bit, bh->b_data); + + /* + * Also if the number of blocks within the group is less than the + * blocksize * 8 ( which is the size of bitmap ), set rest of the + * block bitmap to 1 + */ + for (bit = EXT4_BLOCKS_PER_GROUP(sb); + bit < sb->s_blocksize * 8; bit++) { + ext4_set_bit(bit, bh->b_data); + } } return free_blocks - sbi->s_itb_per_group - 2; diff --git a/fs/ext4/group.h b/fs/ext4/group.h index 9310979..5165311 100644 --- a/fs/ext4/group.h +++ b/fs/ext4/group.h @@ -8,9 +8,6 @@ #ifndef _LINUX_EXT4_GROUP_H #define _LINUX_EXT4_GROUP_H -#if defined(CONFIG_CRC16) -#include <linux/crc16.h> -#endif extern __le16 ext4_group_desc_csum(struct ext4_sb_info *sbi, __u32 group, struct ext4_group_desc *gdp); diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 5ffc80b..28ad4fc 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -34,6 +34,7 @@ #include <linux/pagemap.h> #include <linux/seq_file.h> #include <linux/version.h> +#include "group.h" /* * MUSTDO: @@ -893,6 +894,14 @@ static int ext4_mb_init_cache(struct page *page, char *incore) continue; } + if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) { + + ext4_init_block_bitmap(sb, bh[i], + first_group + i, desc); + set_buffer_uptodate(bh[i]); + unlock_buffer(bh[i]); + continue; + } get_bh(bh[i]); bh[i]->b_end_io = end_buffer_read_sync; submit_bh(READ, bh[i]); @@ -1702,11 +1711,10 @@ static void ext4_mb_scan_aligned(struct ext4_allocation_context *ac, static int ext4_mb_good_group(struct ext4_allocation_context *ac, int group, int cr) { + unsigned free, fragments; + unsigned i, bits; + struct ext4_group_desc *desc; struct ext4_group_info *grp = EXT4_GROUP_INFO(ac->ac_sb, group); - unsigned free; - unsigned fragments; - unsigned i; - unsigned bits; BUG_ON(cr < 0 || cr >= 4); BUG_ON(EXT4_MB_GRP_NEED_INIT(grp)); @@ -1721,6 +1729,11 @@ static int ext4_mb_good_group(struct ext4_allocation_context *ac, switch (cr) { case 0: BUG_ON(ac->ac_2order == 0); + /* If this group is uninitialized, skip it initially */ + desc = ext4_get_group_desc(ac->ac_sb, group, NULL); + if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) + return 0; + bits = ac->ac_sb->s_blocksize_bits + 1; for (i = ac->ac_2order; i <= bits; i++) if (grp->bb_counters[i] > 0) @@ -1805,6 +1818,7 @@ repeat: ac->ac_criteria = cr; for (i = 0; i < EXT4_SB(sb)->s_groups_count; group++, i++) { struct ext4_group_info *grp; + struct ext4_group_desc *desc; if (group == EXT4_SB(sb)->s_groups_count) group = 0; @@ -1844,12 +1858,16 @@ repeat: } ac->ac_groups_scanned++; - if (cr == 0) + desc = ext4_get_group_desc(sb, group, NULL); + if (cr == 0 || (desc->bg_flags & + cpu_to_le16(EXT4_BG_BLOCK_UNINIT) && + ac->ac_2order != 0)) { ext4_mb_simple_scan_group(ac, &e4b); - else if (cr == 1 && ac->ac_g_ex.fe_len == sbi->s_stripe) + } else if (cr == 1 && ac->ac_g_ex.fe_len == sbi->s_stripe) { ext4_mb_scan_aligned(ac, &e4b); - else + } else { ext4_mb_complex_scan_group(ac, &e4b); + } ext4_unlock_group(sb, group); ext4_mb_release_desc(&e4b); @@ -2267,11 +2285,8 @@ static void ext4_mb_store_history(struct ext4_allocation_context *ac) static int ext4_mb_init_backend(struct super_block *sb) { + int i, j, len, metalen; struct ext4_sb_info *sbi = EXT4_SB(sb); - int i; - int j; - int len; - int metalen; int num_meta_group_infos = (sbi->s_groups_count + EXT4_DESC_PER_BLOCK(sb) - 1) >> EXT4_DESC_PER_BLOCK_BITS(sb); @@ -2321,7 +2336,7 @@ static int ext4_mb_init_backend(struct super_block *sb) sbi->s_group_info[i >> EXT4_DESC_PER_BLOCK_BITS(sb)]; j = i & (EXT4_DESC_PER_BLOCK(sb) - 1); - meta_group_info[j] = kmalloc(len, GFP_KERNEL); + meta_group_info[j] = kzalloc(len, GFP_KERNEL); if (meta_group_info[j] == NULL) { printk(KERN_ERR "EXT4-fs: can't allocate buddy mem\n"); i--; @@ -2333,7 +2348,6 @@ static int ext4_mb_init_backend(struct super_block *sb) "EXT4-fs: can't read descriptor %u\n", i); goto err_freebuddy; } - memset(meta_group_info[j], 0, len); set_bit(EXT4_GROUP_INFO_NEED_INIT_BIT, &meta_group_info[j]->bb_state); @@ -2919,9 +2933,17 @@ static int ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, ac->ac_b_ex.fe_len); spin_lock(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group)); + if (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) { + gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT); + gdp->bg_free_blocks_count = + cpu_to_le16(ext4_free_blocks_after_init(sb, + ac->ac_b_ex.fe_group, + gdp)); + } gdp->bg_free_blocks_count = cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) - ac->ac_b_ex.fe_len); + gdp->bg_checksum = ext4_group_desc_csum(sbi, ac->ac_b_ex.fe_group, gdp); spin_unlock(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group)); percpu_counter_mod(&sbi->s_freeblocks_counter, - ac->ac_b_ex.fe_len); @@ -4066,7 +4088,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, #if 0 static int ext4_mballoc_warning = 0; if (ext4_mballoc_warning++ == 0) - printk(KERN_ERR "EXT3-fs: multiblock request with " + printk(KERN_ERR "EXT4-fs: multiblock request with " "mballoc disabled!\n"); ar->len = 1; #endif @@ -4353,6 +4375,7 @@ do_more: spin_lock(sb_bgl_lock(sbi, block_group)); gdp->bg_free_blocks_count = cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) + count); + gdp->bg_checksum = ext4_group_desc_csum(sbi, block_group, gdp); spin_unlock(sb_bgl_lock(sbi, block_group)); percpu_counter_mod(&sbi->s_freeblocks_counter, count); ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2007-10-02 18:10 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-10-01 22:10 Ext4 devel interlock meeting minutes (October 1, 2007) Avantika Mathur 2007-10-02 5:58 ` Andreas Dilger 2007-10-02 14:23 ` Eric Sandeen 2007-10-02 9:24 ` Valerie Clement 2007-10-02 11:36 ` Aneesh Kumar K.V 2007-10-02 14:12 ` Andreas Dilger 2007-10-02 18:10 ` Aneesh Kumar K.V
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).