* Ext4 devel interlock meeting minutes (April 23, 2007)
From: Avantika Mathur @ 2007-04-23 23:35 UTC
To: linux-ext4
Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
Attendees: Mingming Cao, Dave Kleikamp, Avantika Mathur, Ted Ts'o,
Suparna Bhattacharya, Jean-Pierre Dion, Jean Noel Cordenner,
Valérie Clément, Jose Santos
Minutes can be accessed at:
http://ext4.wiki.kernel.org/index.php/Ext4_Developer%27s_Conference_Call
- Mingming proposed moving back to the 8am PST meeting time, since the
6am time is inconvenient for a few people. This discussion will be
continued through email, to find a time which works for everyone.
- Next week's meeting will be canceled, unless anyone would like to
request a meeting.
PATCH STATUS
git-tree
- Mingming will be updating the git tree with extents-fix patches from
Alex, i_flags patch from Honza, i_extra_isize patch from Kalpak.
Uninitialized Block Groups:
- The patch sent out by Andreas is against 2.6.16 and ext3. Need to
port this to current ext4, test and then add to git-tree. Avantika will
ask Andreas if he needs help with this.
JBD statistics:
- There is a patch to export JBD statistics to /proc. In order to get
this patch to mainline, there needs to be discussion about the correct
place for the statistics; /proc or perhaps debugfs.
e2fsprogs:
- Ted will post the current e2fsprogs patches in progress. Ted has been
working with these patches and making changes.
- Main work areas for making e2fsprogs compatible with extents and
64-bit block numbers:
- block iterator: make a block iterator work with both extent and
non-extent code. Code that is oblivious to extents will still work with
the block iterator. This has been written by Andreas Dilger.
- extents: in order to preserve ABI compatibility, add support for a new
interface for extents which uses 64-bit logical and physical block
numbers. The block iterator then translates from on-disk to in-memory
format. This will allow for possible future increases of physical and
logical block sizes in extents, without breaking the ABI.
- bitmaps in e2fsprogs: this will be discussed in more detail at the
next meeting, after people have a chance to read related email.
preallocation:
- fallocate syscall interface: the current plan, based on discussions
on the mailing list, is to create a separate wrapper for s390 in glibc.
All other architectures will use the regular parameter ordering, and
s390 will use a different order. Jakub Jelinek has said that the changes
in glibc can be made pretty easily.
- The preallocation patches in the ext4 git-tree are outdated, using
the ioctl interface. Once Amit re-posts the patches with the syscall
interface, they will be updated in the git-tree as well.
- Mingming mentioned the need to flush preallocation metadata changes to
disk if file size or file content is being tested. The group discussed
doing an fsync at bmap time.
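As a rough illustration of the preallocation items above, the following is a minimal sketch that uses posix_fallocate(), which glibc already provides, as a stand-in for the fallocate syscall interface still under discussion. It preallocates space, calls fsync() so that the metadata change reaches disk, and only then checks the file size. The helper names preallocate_file and file_size are hypothetical and not part of any ext4 patch.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reserve `len` bytes starting at offset 0 of `path`.
 * Returns 0 on success, or an errno value on failure. */
int preallocate_file(const char *path, off_t len)
{
	int fd, err;

	assert(len > 0);	/* posix_fallocate rejects len <= 0 with EINVAL */

	fd = open(path, O_CREAT | O_RDWR, 0644);
	if (fd < 0)
		return errno;

	/* posix_fallocate returns the error number directly, not via errno */
	err = posix_fallocate(fd, 0, len);

	/* Flush data and metadata before anyone inspects the file size */
	if (err == 0 && fsync(fd) != 0)
		err = errno;

	close(fd);
	return err;
}

/* Hypothetical helper: file size in bytes as reported by stat(), or -1 */
off_t file_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return st.st_size;
}
```

Note that posix_fallocate() extends the file size when the requested range lies past the end of the file, so a test that compares sizes afterwards should expect the larger value.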
TESTING
- extents testing
- Discussed methods for testing extents on highly fragmented
filesystems.
- Jose will look into possible tests, including perhaps using the
'aged' option in FFSB
- Ted suggested adding a mount option that enables a deliberately bad
block allocator, which jumps to a new block group every 8 blocks. This
would force a very large number of extents, and may be a good test for
extents.
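To get a feel for how many extents such an allocator would produce, here is a small user-space model, not kernel code. It assumes 32768 blocks per group (4 KiB blocks) and that every run of 8 blocks lands in a fresh group. RUN_LEN, BLOCKS_PER_GROUP, bad_alloc and count_extents are names made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define RUN_LEN 8		/* blocks allocated before jumping to a new group */
#define BLOCKS_PER_GROUP 32768	/* assumption: 4 KiB blocks, 32768 blocks per group */

/* Model of the suggested "bad" allocator: map a logical block to a
 * physical block, starting a new block group after every RUN_LEN blocks. */
uint64_t bad_alloc(uint64_t lblk)
{
	uint64_t group = lblk / RUN_LEN;

	/* Guard against overflow of the modelled physical block number */
	assert(group <= UINT64_MAX / BLOCKS_PER_GROUP - 1);
	return group * BLOCKS_PER_GROUP + lblk % RUN_LEN;
}

/* Count extents for a file of `nblocks` blocks: a new extent starts
 * whenever the physical block does not directly follow the previous one. */
uint64_t count_extents(uint64_t nblocks)
{
	uint64_t extents = 0, prev = 0, l;

	for (l = 0; l < nblocks; l++) {
		uint64_t p = bad_alloc(l);

		if (l == 0 || p != prev + 1)
			extents++;
		prev = p;
	}
	return extents;
}
```

Under these assumptions a file of 1024 blocks ends up with 128 extents, one per run of 8 blocks. A real allocator would eventually have to reuse groups, which this model ignores.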
- large filesystem
- We would like to perform more testing on large (>16TB) filesystems
- Currently, hardware limitations are preventing this testing. We
have tested 10TB RAID disks and 16TB loopback devices. Avantika will
look into creating very large sparse devices for testing.
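One simple building block for a large sparse test device is a sparse backing file, which could then be attached to a loop device. The sketch below is only an illustration of that idea: make_sparse_file and allocated_bytes are hypothetical helper names, and the maximum file size of the hosting filesystem still caps how large such an image can be.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` as a sparse file of `size` bytes.
 * No data blocks are written; the file is all hole. Returns 0 or -1. */
int make_sparse_file(const char *path, off_t size)
{
	int fd, rc;

	assert(size >= 0);

	fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* ftruncate sets the logical size without allocating blocks */
	rc = ftruncate(fd, size);
	close(fd);
	return rc;
}

/* Hypothetical helper: bytes actually allocated on disk, or -1 on error.
 * st_blocks is counted in 512-byte units. */
long long allocated_bytes(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return (long long)st.st_blocks * 512;
}
```

Comparing allocated_bytes() against the logical size is a quick way to confirm that the image really is sparse before building a filesystem on it.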
- Large file deletion
- Valerie recently tested large file deletion on ext3 and ext4, but did
not see the performance gain expected from ext4's more compact metadata
when using extents.
- Valerie will try re-running the test. Jose will also be looking
into this test.
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Alex Tomas @ 2007-04-24 6:00 UTC
To: Avantika Mathur; +Cc: linux-ext4

Avantika Mathur wrote:
> TESTING
> - extents testing
> - Discussed methods for testing extents on highly fragmented
> filesystems.
> - Jose will look into possible tests, including perhaps using the
> 'aged' option in FFSB
> - Ted suggested creating a mountoption that creates a bad block
> allocator which it jumps to a new block group every 8 blocks. This
> would force a very large number of extents, and may be a good test for
> extents.

there is AGGRESSIVE_TEST define which limits number of entries in
index/leaf.

> - Large file deletion
> - Valerie had recently tested large file deletion on ext3/4, but did
> not see the expected performance gain with ext4 due to compact metadata
> when using extents.

any details?

thanks, Alex
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Valerie Clement @ 2007-04-24 14:04 UTC
To: Alex Tomas; +Cc: Avantika Mathur, linux-ext4, Mingming Cao

Alex Tomas wrote:
>> - Large file deletion
>> - Valerie had recently tested large file deletion on ext3/4, but
>> did not see the expected performance gain with ext4 due to compact
>> metadata when using extents.
>
> any details?
>

Ok, I found my mistake. There was a typo in my test script and the
pagecache was not flushed between the file creation and the deletion.

Here are the results I obtain with a 2.6.17-rc7 kernel to delete a 100GB
file:

ext3 : real 2m35.048s   user 0m0.000s   sys 0m6.424s
ext4 : real 0m11.160s   user 0m0.000s   sys 0m5.532s
xfs  : real 0m0.377s    user 0m0.004s   sys 0m0.004s

The performance gain with ext4 is much larger when running a good test...

Sorry for the wrong information,
Valérie
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Alex Tomas @ 2007-04-24 14:21 UTC
To: Valerie Clement; +Cc: Avantika Mathur, linux-ext4, Mingming Cao

Valerie Clement wrote:
> Here are the results I obtain with a 2.6.17-rc7 kernel to delete a 100GB
> file:
>
> ext3 : real 2m35.048s   user 0m0.000s   sys 0m6.424s
> ext4 : real 0m11.160s   user 0m0.000s   sys 0m5.532s
> xfs  : real 0m0.377s    user 0m0.004s   sys 0m0.004s

would be very interesting to know how much IO was done to remove the file
and actual fragmentation in all the cases.

thanks, Alex
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Valerie Clement @ 2007-04-24 14:51 UTC
To: Alex Tomas; +Cc: Avantika Mathur, linux-ext4, Mingming Cao

Alex Tomas wrote:
> Valerie Clement wrote:
>> Here are the results I obtain with a 2.6.17-rc7 kernel to delete a
>> 100GB file:
>>
>> ext3 : real 2m35.048s   user 0m0.000s   sys 0m6.424s
>> ext4 : real 0m11.160s   user 0m0.000s   sys 0m5.532s
>> xfs  : real 0m0.377s    user 0m0.004s   sys 0m0.004s
>
> would be very interesting to know how much IO was done to remove the file
> and actual fragmentation in all the cases.
>
> thanks, Alex
>

Ok, I will do it.

Valérie
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Eric Sandeen @ 2007-04-24 14:27 UTC
To: Avantika Mathur; +Cc: linux-ext4

Avantika Mathur wrote:
> - large filesystem
> - We would like to perform more testing on large (>16TB) filesystems
> - currently hardware limitations are preventing this testing. We
> have tested 10TB raid dists, and 16TB loopback devices. Avantika will
> look into creating very large sparse devices for testing.

I've been hacking up some ext3@16T testing scripts to use sparse
devicemapper devices which make use of snapshots... loopback files don't
work for testing, at least not hosted on ext[234], because we still
can't do these large file offsets.

(Documentation/device-mapper/zero.txt in the kernel tree describes these
sparse dm devices)

Testing the whole range as a sparse snapshot can be slow, since
devicemapper has to do all the exception handling etc, and I think
essentially creates a fragmented block device.

I've been playing with something like this:

# 90% of the real device size is used for a "real" 1:1 mapping
# The other 10% is sparsely mapped out to add up to totalsize.
# i.e. -
#                      [large sparse-ish device]
#
# +----------------------~ ~-----------------------------------------+
# |                 sparse                   |         real          |
# +----------------------~ ~-----------------------------------------+
#
# |<------------ SPARSE_SIZE ---------------->|<----- REAL_SIZE ----->|
#                        is mapped on top of:
#                        [real block device]
#                  +----------------------------+
#                  | sp |         real          |
#                  +----------------------------+

and then marking the sparse range as full (maybe via lazy_bg, or other
methods). You could then also put a dm-error target under the "full"
sections so that any IO that may stray there will fail.

This way you can direct the real IO to the 1:1 mapping portion of the
large dm device, and shouldn't get the snapshot slowdowns.

Anyway, just something I've been playing with...

-eric
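The size arithmetic behind the layout described above can be sketched as follows. This is only an illustration of the 90%/10% split from the message, not Eric's script: struct dm_layout and plan_layout are hypothetical names, and sizes are assumed to be in 512-byte sectors as used in device-mapper tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the combined device, all sizes in sectors */
struct dm_layout {
	uint64_t real_sectors;		/* REAL_SIZE: 1:1 mapped part (90% of device) */
	uint64_t sparse_backing_sectors;	/* real space backing the sparse part (10%) */
	uint64_t sparse_sectors;	/* SPARSE_SIZE: virtual size of the sparse part */
};

/* Split a real device of `dev_sectors` so that the combined virtual
 * device adds up to `total_sectors`. Returns 0 on success, -1 if the
 * requested total is smaller than the real device. */
int plan_layout(uint64_t dev_sectors, uint64_t total_sectors,
		struct dm_layout *out)
{
	assert(out != NULL);

	if (total_sectors < dev_sectors)
		return -1;

	out->real_sectors = dev_sectors / 10 * 9;
	out->sparse_backing_sectors = dev_sectors - out->real_sectors;
	out->sparse_sectors = total_sectors - out->real_sectors;
	return 0;
}
```

The resulting numbers would feed the lengths of the device-mapper table entries; building and loading the actual tables is left out here.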
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Aneesh Kumar @ 2007-04-30 11:06 UTC
To: Avantika Mathur; +Cc: linux-ext4

On 4/24/07, Avantika Mathur <mathur@linux.vnet.ibm.com> wrote:
> Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
>
> TESTING
> - extents testing
> - Discussed methods for testing extents on highly fragmented
> filesystems.
> - Jose will look into possible tests, including perhaps using the
> 'aged' option in FFSB
> - Ted suggested creating a mountoption that creates a bad block
> allocator which it jumps to a new block group every 8 blocks. This
> would force a very large number of extents, and may be a good test for
> extents.

What i am doing for creating a large number of extents is

dd if=/dev/zero of=myfile count=10
seek=20
while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
seek=`expr $seek + 20`; done

-aneesh
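The same idea as the dd loop above, writing short runs of data separated by holes so that no two runs can be merged into one extent, can be sketched in C with a bounded number of iterations. write_with_holes and file_length are hypothetical helper names, and the chunk and stride sizes are left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `count` zero-filled chunks of `chunk` bytes,
 * one every `stride` bytes, leaving a hole after each chunk.
 * Returns 0 on success, -1 on failure. */
int write_with_holes(const char *path, size_t chunk, off_t stride, int count)
{
	char *buf;
	int fd, i, rc = 0;

	/* stride must exceed chunk, otherwise no hole is left between runs */
	assert(chunk > 0 && count > 0 && stride > (off_t)chunk);

	buf = calloc(1, chunk);
	if (!buf)
		return -1;

	fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0) {
		free(buf);
		return -1;
	}

	for (i = 0; i < count && rc == 0; i++) {
		/* pwrite places each chunk at an explicit offset */
		if (pwrite(fd, buf, chunk, (off_t)i * stride) != (ssize_t)chunk)
			rc = -1;
	}

	if (close(fd) != 0)
		rc = -1;
	free(buf);
	return rc;
}

/* Hypothetical helper: logical file length in bytes, or -1 on error */
off_t file_length(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return st.st_size;
}
```

Choosing chunk and stride as multiples of the filesystem block size keeps each run on whole blocks; how many extents actually result still depends on the allocator.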
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Alex Tomas @ 2007-04-30 11:13 UTC
To: Aneesh Kumar; +Cc: Avantika Mathur, linux-ext4

Aneesh Kumar wrote:
> What i am doing for creating a large number of extents is
>
> dd if=/dev/zero of=myfile count=10
> seek=20
> while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
> seek=`expr $seek + 20`; done

with AGGRESSIVE_TEST defined in include/linux/ext4_fs_extents.h
you may get much more extents and index blocks.
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Kalpak Shah @ 2007-05-01 12:08 UTC
To: Aneesh Kumar; +Cc: Avantika Mathur, linux-ext4

On Mon, 2007-04-30 at 16:36 +0530, Aneesh Kumar wrote:
> On 4/24/07, Avantika Mathur <mathur@linux.vnet.ibm.com> wrote:
> > Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
> >
> > TESTING
> > - extents testing
> > - Discussed methods for testing extents on highly fragmented
> > filesystems.
> > - Jose will look into possible tests, including perhaps using the
> > 'aged' option in FFSB
> > - Ted suggested creating a mountoption that creates a bad block
> > allocator which it jumps to a new block group every 8 blocks. This
> > would force a very large number of extents, and may be a good test for
> > extents.
>
> What i am doing for creating a large number of extents is
>
> dd if=/dev/zero of=myfile count=10
> seek=20
> while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
> seek=`expr $seek + 20`; done
>

I had written a simple tool "bitmap_manip" with which you can actually
manipulate the number of free chunks and their sizes in a filesystem. It
uses libext2fs to set the bits in block bitmaps thereby leaving the
desired free extents. I had written it to test the allocator's
performance.

It can be used as:

./bitmap_manip /dev/sda9 1MA 4 16K 1 12K 3 8K 4 4K 6

This will leave only 1 16K chunk, 3 12K chunks, .... free in the
filesystem. "1MA" 4 will get us 4 1Mb free ALIGNED chunks.

It isn't very beautiful code since it was only used for testing but
maybe it can help.

Thanks,
Kalpak.

> -aneesh

[-- Attachment #2: bitmap_manip.c --]

/* Manipulate block bitmap directly for mballoc testing */
/* USAGE:
 * ./bitmap_manip /dev/volmballoc/test 16K 1 12K 3 8K 4 4K 6
 * This will leave 1 16K chunk, 3 12K chunks, .... in the filesystem specified.
 * Ideally give the inputs in ascending order.
 * 1MA 4 will get us 4 1Mb ALIGNED chunks.
 */
#include <stdio.h>
#include <string.h>	/* strcpy, strlen */
#include <ctype.h>	/* toupper */
#include <ext2fs/ext2fs.h>
#include <ext2fs/ext2_types.h>
#include <fcntl.h>
#include <stdlib.h>

#define ONE_MB (1024 * 1024)
#define ONE_KB 1024
#define SETTING 0
#define FREEING 1
#define NO_ALIGN 0
#define ALIGN 1

struct chunk_arg {
	int chunk_size;
	int num_chunks;
	int align;
};

int main(int argc, char **argv)
{
	ext2_filsys fs;
	ext2fs_block_bitmap *map = NULL;
	int bg_num = 0, retval, arg_num, multiply, chunk_num;
	int i, start_blk, set_bit, test_bit, j;
	struct chunk_arg chunk[50];
	int free_blocks_req = 0, free_blocks_avail, num_of_chunks_req = 0, group;
	char str[10];
	float orig_avail_req, avail_req;
	int set_till_now, free_till_now, num_blks_to_set, num_blks_to_free, phase;
	int current, align_flag = 0, align = 0, curr = 0;

	if (argc < 2) {
		printf("Please give name of a filesystem. Exiting...\n");
		return -1;
	}

	/* Even from user's perspective */
	if(argc & 0x01) {
		printf("This utility cannot have even number of arguments.\n");
		return -1;
	}

	if ((retval = ext2fs_open(argv[1], EXT2_FLAG_RW, 0, 0, unix_io_manager, &fs))) {
		com_err("ext2fs open:", retval, "while opening %s\n", argv[1]);
		return retval;
	}

	srand(1234567);
	chunk_num = 0;
	for (arg_num = 2; arg_num < argc; arg_num += 2, chunk_num++) {
		strcpy(str, argv[arg_num]);

		/* Check if we have to align */
		if (toupper(str[strlen(str) - 1 ]) == 'A') {
			chunk[chunk_num].align = ALIGN;
			str[strlen(str) - 1] = '\0';
			align = 1;
		} else
			chunk[chunk_num].align = NO_ALIGN;

		if (toupper(str[strlen(str) - 1]) == 'K')
			multiply = ONE_KB;
		else if(toupper(str[strlen(str) - 1]) == 'M')
			multiply = ONE_MB;
		str[strlen(str) - 1] = '\0';

		chunk[chunk_num].chunk_size = ((strtod(str, NULL)) * multiply)/ (fs->blocksize);
		chunk[chunk_num].num_chunks = strtod(argv[arg_num + 1], NULL);
		free_blocks_req += chunk[chunk_num].chunk_size * chunk[chunk_num].num_chunks;
		num_of_chunks_req += chunk[chunk_num].num_chunks;
	}

	ext2fs_read_block_bitmap(fs);
	map = &fs->block_map;
	start_blk = fs->super->s_first_data_block;
	free_blocks_avail = fs->super->s_free_blocks_count;
	orig_avail_req = free_blocks_avail / free_blocks_req;

	current = 0;
	i = start_blk;
	num_blks_to_set = (orig_avail_req / 4) * chunk[current].chunk_size;
	num_blks_to_free = chunk[current].chunk_size;
	phase = SETTING;

	do {
		test_bit = i;
		if (!ext2fs_fast_test_block_bitmap(*map, test_bit)) {
			if (phase == SETTING) {
				if (chunk[current].align == ALIGN && chunk[current].num_chunks > 0) {
					if (align_flag == 0) {
						num_blks_to_set = (i / chunk[current].chunk_size + 1) * chunk[current].chunk_size - i;
						align_flag = 1;
					} else if (i % chunk[current].chunk_size == 0) {
						num_blks_to_set = 0;
						phase = FREEING;
					}
				}
				set_bit = i;
				ext2fs_mark_block_bitmap(*map, set_bit);
				group = (set_bit - fs->super->s_first_data_block) / fs->super->s_blocks_per_group;
				fs->group_desc[group].bg_free_blocks_count--;
				fs->super->s_free_blocks_count--;
				num_blks_to_set--;
				if (num_blks_to_set == 0) {
					phase = FREEING;
					align_flag = 0;
				}
			} else if (phase == FREEING) {
				free_blocks_req--;
				num_blks_to_free--;
				if (num_blks_to_free == 0) {
					/* Decide how many blocks to set */
					phase = SETTING;
					num_of_chunks_req--;
					chunk[current].num_chunks--;
					/* No more free chunks required*/
					if (num_of_chunks_req == 0) {
						num_blks_to_set = free_blocks_avail;
					} else {
						for (j = 0; j < chunk_num; j++) {
							if (chunk[j].num_chunks > 0) {
								if (free_blocks_req > chunk[j].num_chunks * chunk[j].chunk_size && current == j) {
									continue;
								} else {
									current = j;
									break;
								}
							}
						}
						avail_req = free_blocks_avail / free_blocks_req;
						if (align != 1)
							num_blks_to_set = (avail_req / 4) * chunk[current].chunk_size;
						else
							num_blks_to_set = 20;
						num_blks_to_free = chunk[current].chunk_size;
						/* Make sure a free block does not break across block groups */
						curr = i % 32767;
						curr = 32767 * (curr + 1);
						if (i + num_blks_to_set + num_blks_to_free > curr && i < curr)
							num_blks_to_set += (curr) - (i + num_blks_to_set);
					}
				}
			}
			free_blocks_avail--;
		}
		i++;
	}while(i <= (fs->super->s_blocks_count - 1) || free_blocks_avail != 0);

	ext2fs_mark_bb_dirty(fs);
	ext2fs_mark_super_dirty(fs);

	if (i == fs->super->s_blocks_count && free_blocks_req != 0) {
		printf("Block manipulation failed. Sorry.\n");
		return 0;
	}

	ext2fs_close(fs);
	return 0;
}