* Ext4 devel interlock meeting minutes (April 23, 2007)
From: Avantika Mathur @ 2007-04-23 23:35 UTC (permalink / raw)
To: linux-ext4
Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
Attendees: Mingming Cao, Dave Kleikamp, Avantika Mathur, Ted Ts'o,
Suparna Bhattacharya, Jean-Pierre Dion, Jean Noel Cordenner,
Valérie Clément, Jose Santos
Minutes can be accessed at:
http://ext4.wiki.kernel.org/index.php/Ext4_Developer%27s_Conference_Call
- Mingming proposed moving back to the 8am PST meeting time, since the 6am
time is inconvenient for a few people. This discussion will continue by
email, to find a time that works for everyone.
- Next week's meeting will be canceled, unless there is anyone who would
like to request a meeting.
PATCH STATUS
git-tree
- Mingming will be updating the git tree with extents-fix patches from
Alex, i_flags patch from Honza, i_extra_isize patch from Kalpak.
Uninitialized Block Groups:
- The patch sent out by Andreas is against 2.6.16 and ext3. Need to
port this to current ext4, test and then add to git-tree. Avantika will
ask Andreas if he needs help with this.
JBD statistics:
- There is a patch to export JBD statistics to /proc. Before this patch
can go to mainline, there needs to be discussion about the correct
place for the statistics: /proc or perhaps debugfs.
e2fsprogs:
- Ted will post the current e2fsprogs patches in progress. Ted has been
working with these patches and making changes.
- Main work areas for making e2fsprogs compatible with extents and 64-bit:
- block iterator: make the block iterator work with both extent and
non-extent code. Code that is oblivious to extents will still work with
the block iterator. This has been written by Andreas Dilger.
- extents: in order to preserve ABI compatibility, add support for a new
extents interface which uses 64-bit logical and physical block
numbers. The block iterator then translates from on-disk to in-memory
format. This will allow for possible future increases of physical and
logical block sizes in extents, without breaking ABI.
- bitmaps in e2fsprogs: this will be discussed in more detail at the
next meeting, after people have a chance to read related email.
preallocation:
- fallocate syscall interface: the current plan, based on discussions
on the mailing list, is to create a separate wrapper for s390 in glibc.
All other architectures will use the regular parameter ordering, and
s390 will use a different order. Jakub Jelinek has said that the changes
in glibc can be made pretty easily.
- The preallocation patches in the ext4 git-tree are outdated, using
the ioctl interface. Once Amit re-posts the patches with the syscall
interface, they will be updated in the git-tree as well.
- Mingming mentioned the need to flush preallocation metadata changes to
disk if file size or file content is being tested. Discussed doing an
fsync at bmap time.
TESTING
- extents testing
- Discussed methods for testing extents on highly fragmented
filesystems.
- Jose will look into possible tests, including perhaps using the
'aged' option in FFSB
- Ted suggested creating a mount option that enables a deliberately bad
block allocator, one that jumps to a new block group every 8 blocks. This
would force a very large number of extents, and may be a good test for
extents.
- large filesystem
- We would like to perform more testing on large (>16TB) filesystems;
currently hardware limitations are preventing this testing. We
have tested 10TB RAID disks, and 16TB loopback devices. Avantika will
look into creating very large sparse devices for testing.
- Large file deletion
- Valerie had recently tested large file deletion on ext3/4, but did
not see the performance gain that was expected from ext4's more compact
metadata when using extents.
- Valerie will try re-running the test. Jose will also be looking
into this test.
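
For a rough sense of scale, the effect of the test allocator Ted suggested can be estimated with simple arithmetic. The sketch below is only an illustration: the function names are made up here, and it assumes the ext4 on-disk extent layout with a 12-byte header and 12-byte extent entries, so a leaf block holds (blocksize - 12) / 12 extents.

```shell
#!/bin/sh
# extent_count FILE_BYTES BLOCK_SIZE RUN_BLOCKS
# Number of extents needed if the allocator starts a new, non-adjacent
# run every RUN_BLOCKS blocks (8 in the suggestion from the minutes).
extent_count() {
    blocks=$(( ($1 + $2 - 1) / $2 ))      # file size in blocks, rounded up
    echo $(( (blocks + $3 - 1) / $3 ))    # one extent per run, rounded up
}

# leaf_blocks EXTENTS BLOCK_SIZE
# Number of extent-tree leaf blocks needed to hold EXTENTS entries,
# assuming a 12-byte header and 12-byte entries per leaf block.
leaf_blocks() {
    per_leaf=$(( ($2 - 12) / 12 ))
    echo $(( ($1 + per_leaf - 1) / per_leaf ))
}

# Example: a 1GB file on a 4K-block filesystem, new run every 8 blocks.
# extents=$(extent_count 1073741824 4096 8)
# leaf_blocks "$extents" 4096
```

Under these assumptions a 1GB file needs 32768 extents spread over 97 leaf blocks, which is far more than the four extents that fit directly in the inode.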
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Alex Tomas @ 2007-04-24 6:00 UTC (permalink / raw)
To: Avantika Mathur; +Cc: linux-ext4
Avantika Mathur wrote:
> TESTING
> - extents testing
> - Discussed methods for testing extents on highly fragmented
> filesystems.
> - Jose will look into possible tests, including perhaps using the
> 'aged' option in FFSB
> - Ted suggested creating a mount option that enables a deliberately bad
> block allocator, one that jumps to a new block group every 8 blocks. This
> would force a very large number of extents, and may be a good test for
> extents.
There is an AGGRESSIVE_TEST define which limits the number of entries in
index/leaf blocks.
> - Large file deletion
> - Valerie had recently tested large file deletion on ext3/4, but did
> not see the expected performance gain with ext4 due to compact metadata
> when using extents.
any details?
thanks, Alex
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Valerie Clement @ 2007-04-24 14:04 UTC (permalink / raw)
To: Alex Tomas; +Cc: Avantika Mathur, linux-ext4, Mingming Cao
Alex Tomas wrote:
>> - Large file deletion
>> - Valerie had recently tested large file deletion on ext3/4, but
>> did not see the expected performance gain with ext4 due to compact
>> metadata when using extents.
>
> any details?
>
Ok, I found my mistake. There was a typo in my test script and the
pagecache was not flushed between the file creation and the deletion.
Here are the results I obtained with a 2.6.17-rc7 kernel when deleting a
100GB file:
ext3 : real 2m35.048s   user 0m0.000s   sys 0m6.424s
ext4 : real 0m11.160s   user 0m0.000s   sys 0m5.532s
xfs  : real 0m0.377s    user 0m0.004s   sys 0m0.004s
The performance gain with ext4 is much larger when running a good test...
Sorry for the wrong information,
Valérie
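
To put the elapsed times above side by side, a small helper can convert the "real" values printed by time(1) into seconds and compute the ratios. This is only a sketch for reading the numbers; the function names are made up for illustration and are not part of the original test script.

```shell
#!/bin/sh
# to_seconds 2m35.048s -> 155.048
# Converts the "XmY.YYYs" format printed by the shell's time builtin.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# speedup SLOW FAST -> how many times faster FAST is than SLOW
speedup() {
    LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# Example with the figures reported above:
# speedup 2m35.048s 0m11.160s    # ext3 vs ext4
# speedup 0m11.160s 0m0.377s     # ext4 vs xfs
```

With the reported figures, ext4 deletes the file roughly 13.9 times faster than ext3, while xfs is still roughly 29.6 times faster than ext4.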
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Alex Tomas @ 2007-04-24 14:21 UTC (permalink / raw)
To: Valerie Clement; +Cc: Avantika Mathur, linux-ext4, Mingming Cao
Valerie Clement wrote:
> Here are the results I obtain with a 2.6.17-rc7 kernel to delete a 100GB
> file:
>
> ext3 : real 2m35.048s user 0m0.000s sys 0m6.424s
> ext4 : real 0m11.160s user 0m0.000s sys 0m5.532s
> xfs : real 0m0.377s user 0m0.004s sys 0m0.004s
would be very interesting to know how much IO was done to remove the file
and actual fragmentation in all the cases.
thanks, Alex
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Eric Sandeen @ 2007-04-24 14:27 UTC (permalink / raw)
To: Avantika Mathur; +Cc: linux-ext4
Avantika Mathur wrote:
> - large filesystem
> - We would like to perform more testing on large (>16TB) filesystems
> - currently hardware limitations are preventing this testing. We
> have tested 10TB RAID disks, and 16TB loopback devices. Avantika will
> look into creating very large sparse devices for testing.
I've been hacking up some ext3@16T testing scripts to use sparse
devicemapper devices which make use of snapshots... loopback files don't
work for testing, at least not hosted on ext[234], because we still
can't do these large file offsets.
(Documentation/device-mapper/zero.txt in the kernel tree describes these
sparse dm devices)
Testing the whole range as a sparse snapshot can be slow, since
devicemapper has to do all the exception handling etc, and I think
essentially creates a fragmented block device.
I've been playing with something like this:
# 90% of the real device size is used for a "real" 1:1 mapping
# The other 10% is sparsely mapped out to add up to totalsize.
# i.e. -
# [large sparse-ish device]
#
# +----------------------~ ~-----------------------------------------+
# | sparse | real |
# +----------------------~ ~-----------------------------------------+
#
# |<------------ SPARSE_SIZE ---------------->|<----- REAL_SIZE ----->|
# is mapped on top of:
# [real block device]
# +----------------------------+
# | sp | real |
# +----------------------------+
and then marking the sparse range as full (maybe via lazy_bg, or other
methods). You could then also put a dm-error target under the "full"
sections so that any IO that may stray there will fail.
This way you can direct the real IO to the 1:1 mapping portion of the
large dm device, and shouldn't get the snapshot slowdowns.
Anyway, just something I've been playing with...
-eric
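
A sketch of how the device-mapper tables for the layout drawn above might be generated. Everything named here is an assumption for illustration: the function name, the dm device names (cow, zero, sparse, big), the 10%/90% split taken from the comment in the diagram, and the 128-sector snapshot chunk size. The function only prints the tables; it does not call dmsetup, and it does not cover marking the sparse range as full or placing a dm-error target.

```shell
#!/bin/sh
# print_hybrid_tables REAL_DEV REAL_SECTORS TOTAL_SECTORS
# Prints one "name: table-line" per line for four dm devices:
#   cow    - first 10% of the real device, used as snapshot COW store
#   zero   - a zero target as large as the sparse range
#   sparse - persistent snapshot of "zero", backed by "cow"
#   big    - sparse range followed by a 1:1 mapping of the remaining 90%
print_hybrid_tables() {
    dev=$1
    real_sectors=$2
    total_sectors=$3
    cow_sectors=$(( real_sectors / 10 ))
    real_size=$(( real_sectors - cow_sectors ))
    sparse_size=$(( total_sectors - real_size ))

    echo "cow: 0 $cow_sectors linear $dev 0"
    echo "zero: 0 $sparse_size zero"
    echo "sparse: 0 $sparse_size snapshot /dev/mapper/zero /dev/mapper/cow P 128"
    echo "big: 0 $sparse_size linear /dev/mapper/sparse 0"
    echo "big: $sparse_size $real_size linear $dev $cow_sectors"
}

# Example (sizes in 512-byte sectors, device name is a placeholder):
# print_hybrid_tables /dev/sdX 1000 100000
```

Each group of lines with the same name would be passed as the table to "dmsetup create <name>", in the order printed, since later devices refer to earlier ones.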
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Valerie Clement @ 2007-04-24 14:51 UTC (permalink / raw)
To: Alex Tomas; +Cc: Avantika Mathur, linux-ext4, Mingming Cao
Alex Tomas wrote:
> Valerie Clement wrote:
>> Here are the results I obtain with a 2.6.17-rc7 kernel to delete a
>> 100GB file:
>>
>> ext3 : real 2m35.048s user 0m0.000s sys 0m6.424s
>> ext4 : real 0m11.160s user 0m0.000s sys 0m5.532s
>> xfs : real 0m0.377s user 0m0.004s sys 0m0.004s
>
> would be very interesting to know how much IO was done to remove the file
> and actual fragmentation in all the cases.
>
> thanks, Alex
>
Ok, I will do it.
Valérie
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Aneesh Kumar @ 2007-04-30 11:06 UTC (permalink / raw)
To: Avantika Mathur; +Cc: linux-ext4
On 4/24/07, Avantika Mathur <mathur@linux.vnet.ibm.com> wrote:
> Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
>
> TESTING
> - extents testing
> - Discussed methods for testing extents on highly fragmented
> filesystems.
> - Jose will look into possible tests, including perhaps using the
> 'aged' option in FFSB
> - Ted suggested creating a mount option that enables a deliberately bad
> block allocator, one that jumps to a new block group every 8 blocks. This
> would force a very large number of extents, and may be a good test for
> extents.
What I am doing to create a large number of extents is:
dd if=/dev/zero of=myfile count=10
seek=20
while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
seek=`expr $seek + 20`; done
-aneesh
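
A bounded variant of the loop above, as a sketch. It writes a fixed number of 10-block runs separated by 10-block holes, so it stops on its own instead of running until the disk fills. The function name and the default 4096-byte block size are assumptions made here: the original uses dd's default 512-byte blocks, and with 4K filesystem blocks those 5KB runs do not always leave a whole filesystem block unwritten between them. conv=notrunc is added so that earlier runs are never truncated away.

```shell
#!/bin/sh
# make_fragmented FILE CHUNKS [BLOCK_SIZE]
# Writes CHUNKS runs of 10 blocks, each followed by a 10-block hole.
# Every hole breaks logical contiguity, so each run needs its own extent.
make_fragmented() {
    file=$1
    chunks=$2
    bs=${3:-4096}
    n=0
    while [ "$n" -lt "$chunks" ]; do
        dd if=/dev/zero of="$file" bs="$bs" count=10 seek=$(( n * 20 )) \
            conv=notrunc 2>/dev/null || return 1
        n=$(( n + 1 ))
    done
}

# Example: 1000 runs, i.e. at least 1000 extents in a file of about 78MB.
# make_fragmented myfile 1000
```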
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Alex Tomas @ 2007-04-30 11:13 UTC (permalink / raw)
To: Aneesh Kumar; +Cc: Avantika Mathur, linux-ext4
Aneesh Kumar wrote:
> What i am doing for creating a large number of extents is
>
> dd if=/dev/zero of=myfile count=10
> seek=20
> while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
> seek=`expr $seek + 20`; done
With AGGRESSIVE_TEST defined in include/linux/ext4_fs_extents.h you may
get many more extents and index blocks.
* Re: Ext4 devel interlock meeting minutes (April 23, 2007)
From: Kalpak Shah @ 2007-05-01 12:08 UTC (permalink / raw)
To: Aneesh Kumar; +Cc: Avantika Mathur, linux-ext4
On Mon, 2007-04-30 at 16:36 +0530, Aneesh Kumar wrote:
> On 4/24/07, Avantika Mathur <mathur@linux.vnet.ibm.com> wrote:
> > Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
> >
> > TESTING
> > - extents testing
> > - Discussed methods for testing extents on highly fragmented
> > filesystems.
> > - Jose will look into possible tests, including perhaps using the
> > 'aged' option in FFSB
> > - Ted suggested creating a mount option that enables a deliberately bad
> > block allocator, one that jumps to a new block group every 8 blocks. This
> > would force a very large number of extents, and may be a good test for
> > extents.
>
>
> What i am doing for creating a large number of extents is
>
> dd if=/dev/zero of=myfile count=10
> seek=20
> while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
> seek=`expr $seek + 20`; done
>
>
I had written a simple tool, "bitmap_manip", with which you can actually
manipulate the number of free chunks and their sizes in a filesystem. It
uses libext2fs to set bits in the block bitmaps, thereby leaving the
desired free extents. I had written it to test the allocator's
performance.
It can be used as:
./bitmap_manip /dev/sda9 1MA 4 16K 1 12K 3 8K 4 4K 6
This will leave only 1 16K chunk, 3 12K chunks, ... free in the
filesystem. "1MA 4" will get us 4 free 1MB ALIGNED chunks.
It isn't very beautiful code since it was only used for testing, but
maybe it can help.
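
The size arguments can be read as follows; this is a sketch in shell of how each size is interpreted, based on the parsing in the attached source, not a replacement for the tool. The function name and the "aligned"/"unaligned" output words are made up here for illustration.

```shell
#!/bin/sh
# parse_chunk_spec SPEC BLOCK_SIZE
# SPEC is a size such as 16K or 1M, optionally followed by A for "aligned".
# Prints the chunk size in filesystem blocks and the alignment request.
parse_chunk_spec() {
    spec=$1
    blocksize=$2
    align=unaligned
    case $spec in
        *[Aa]) align=aligned; spec=${spec%?} ;;    # strip trailing A
    esac
    case $spec in
        *[Kk]) mult=1024 ;;
        *[Mm]) mult=1048576 ;;
        *) echo "unknown size suffix in '$1'" >&2; return 1 ;;
    esac
    num=${spec%?}                                   # strip K or M
    echo "$(( num * mult / blocksize )) $align"
}

# Example: on a 4K-block filesystem
# parse_chunk_spec 1MA 4096    # one aligned chunk of 256 blocks
# parse_chunk_spec 16K 4096    # one unaligned chunk of 4 blocks
```

Each size is followed on the command line by the number of free chunks of that size to leave, so "1MA 4" asks for four aligned chunks of 256 blocks each on a 4K-block filesystem.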
Thanks,
Kalpak.
> -aneesh
[-- Attachment #2: bitmap_manip.c --]
[-- Type: text/x-csrc, Size: 5074 bytes --]
/* Manipulate block bitmap directly for mballoc testing */
/* USAGE:
 * ./bitmap_manip /dev/volmballoc/test 16K 1 12K 3 8K 4 4K 6
 * This will leave 1 16K chunk, 3 12K chunks, .... in the filesystem specified.
 * Ideally give the inputs in ascending order.
 * 1MA 4 will get us 4 1MB ALIGNED chunks.
 */
#include <stdio.h>
#include <ext2fs/ext2fs.h>
#include <ext2fs/ext2_types.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>     /* strlen(), strncpy() */
#include <ctype.h>      /* toupper() */

#define ONE_MB (1024 * 1024)
#define ONE_KB 1024
#define SETTING 0
#define FREEING 1
#define NO_ALIGN 0
#define ALIGN 1
#define MAX_CHUNKS 50

struct chunk_arg {
    int chunk_size;     /* size of one free chunk, in filesystem blocks */
    int num_chunks;     /* how many free chunks of this size to leave */
    int align;          /* ALIGN if chunks must start on a chunk_size boundary */
};

int main(int argc, char **argv)
{
    ext2_filsys fs;
    ext2fs_block_bitmap *map = NULL;
    int bg_num = 0, retval, arg_num, multiply, chunk_num;
    int i, start_blk, set_bit, test_bit, j;
    struct chunk_arg chunk[MAX_CHUNKS];
    int free_blocks_req = 0, free_blocks_avail, num_of_chunks_req = 0, group;
    char str[10];
    float orig_avail_req, avail_req;
    int set_till_now, free_till_now, num_blks_to_set, num_blks_to_free, phase;
    int current, align_flag = 0, align = 0, curr = 0;

    if (argc < 2) {
        printf("Please give name of a filesystem. Exiting...\n");
        return -1;
    }
    /* Even from user's perspective: sizes and counts must come in pairs */
    if (argc & 0x01) {
        printf("This utility cannot have even number of arguments.\n");
        return -1;
    }
    if ((retval = ext2fs_open(argv[1], EXT2_FLAG_RW, 0, 0, unix_io_manager, &fs))) {
        com_err("ext2fs open:", retval, "while opening %s\n", argv[1]);
        return retval;
    }
    srand(1234567);

    /* Parse the <size> <count> pairs */
    chunk_num = 0;
    for (arg_num = 2; arg_num < argc; arg_num += 2, chunk_num++) {
        if (chunk_num >= MAX_CHUNKS) {
            printf("Too many chunk sizes (max %d).\n", MAX_CHUNKS);
            return -1;
        }
        /* Bounded copy: str is only 10 bytes long */
        strncpy(str, argv[arg_num], sizeof(str) - 1);
        str[sizeof(str) - 1] = '\0';
        if (strlen(str) < 2) {
            printf("Bad chunk size \"%s\".\n", argv[arg_num]);
            return -1;
        }
        /* Check if we have to align */
        if (toupper((unsigned char)str[strlen(str) - 1]) == 'A') {
            chunk[chunk_num].align = ALIGN;
            str[strlen(str) - 1] = '\0';
            align = 1;
        }
        else
            chunk[chunk_num].align = NO_ALIGN;
        if (toupper((unsigned char)str[strlen(str) - 1]) == 'K')
            multiply = ONE_KB;
        else if (toupper((unsigned char)str[strlen(str) - 1]) == 'M')
            multiply = ONE_MB;
        else {
            /* multiply was left uninitialized here in the original */
            printf("Chunk size \"%s\" needs a K or M suffix.\n", argv[arg_num]);
            return -1;
        }
        str[strlen(str) - 1] = '\0';
        chunk[chunk_num].chunk_size = ((strtod(str, NULL)) * multiply) / (fs->blocksize);
        chunk[chunk_num].num_chunks = strtod(argv[arg_num + 1], NULL);
        if (chunk[chunk_num].chunk_size <= 0) {
            printf("Chunk size \"%s\" is smaller than one block.\n", argv[arg_num]);
            return -1;
        }
        free_blocks_req += chunk[chunk_num].chunk_size * chunk[chunk_num].num_chunks;
        num_of_chunks_req += chunk[chunk_num].num_chunks;
    }
    /* Avoid dividing by zero below when no chunks were requested */
    if (free_blocks_req <= 0) {
        printf("Please request at least one free chunk.\n");
        return -1;
    }

    ext2fs_read_block_bitmap(fs);
    map = &fs->block_map;
    start_blk = fs->super->s_first_data_block;
    free_blocks_avail = fs->super->s_free_blocks_count;
    orig_avail_req = free_blocks_avail / free_blocks_req;
    current = 0;
    i = start_blk;
    num_blks_to_set = (orig_avail_req / 4) * chunk[current].chunk_size;
    num_blks_to_free = chunk[current].chunk_size;
    phase = SETTING;

    /* Walk every block; alternate between marking free blocks as in use
     * (SETTING) and skipping over blocks that should stay free (FREEING). */
    do {
        test_bit = i;
        if (!ext2fs_fast_test_block_bitmap(*map, test_bit)) {
            if (phase == SETTING) {
                if (chunk[current].align == ALIGN && chunk[current].num_chunks > 0) {
                    if (align_flag == 0) {
                        num_blks_to_set = (i / chunk[current].chunk_size + 1) *
                            chunk[current].chunk_size - i;
                        align_flag = 1;
                    }
                    else if (i % chunk[current].chunk_size == 0) {
                        num_blks_to_set = 0;
                        phase = FREEING;
                    }
                }
                set_bit = i;
                ext2fs_mark_block_bitmap(*map, set_bit);
                group = (set_bit - fs->super->s_first_data_block) / fs->super->s_blocks_per_group;
                fs->group_desc[group].bg_free_blocks_count--;
                fs->super->s_free_blocks_count--;
                num_blks_to_set--;
                if (num_blks_to_set == 0) {
                    phase = FREEING;
                    align_flag = 0;
                }
            }
            else if (phase == FREEING) {
                free_blocks_req--;
                num_blks_to_free--;
                if (num_blks_to_free == 0) {
                    /* Decide how many blocks to set */
                    phase = SETTING;
                    num_of_chunks_req--;
                    chunk[current].num_chunks--;
                    /* No more free chunks required */
                    if (num_of_chunks_req == 0) {
                        num_blks_to_set = free_blocks_avail;
                    }
                    else {
                        for (j = 0; j < chunk_num; j++) {
                            if (chunk[j].num_chunks > 0) {
                                if (free_blocks_req > chunk[j].num_chunks *
                                    chunk[j].chunk_size && current == j) {
                                    continue;
                                }
                                else {
                                    current = j;
                                    break;
                                }
                            }
                        }
                        avail_req = free_blocks_avail / free_blocks_req;
                        if (align != 1)
                            num_blks_to_set = (avail_req / 4) *
                                chunk[current].chunk_size;
                        else
                            num_blks_to_set = 20;
                        num_blks_to_free = chunk[current].chunk_size;
                        /* Make sure a free chunk does not break across block
                         * groups: curr is the first block of the next group.
                         * (The original used a hard-coded 32767 and a modulo
                         * here, which does not yield a group boundary.) */
                        curr = (i - fs->super->s_first_data_block) /
                            fs->super->s_blocks_per_group;
                        curr = fs->super->s_first_data_block +
                            fs->super->s_blocks_per_group * (curr + 1);
                        if (i + num_blks_to_set + num_blks_to_free > curr && i < curr)
                            num_blks_to_set += (curr) - (i + num_blks_to_set);
                    }
                }
            }
            free_blocks_avail--;
        }
        i++;
    /* Stop at the end of the bitmap even if the superblock free count is
     * stale; the original condition could run past the last block. */
    } while (i < fs->super->s_blocks_count);

    ext2fs_mark_bb_dirty(fs);
    ext2fs_mark_super_dirty(fs);
    if (free_blocks_req != 0) {
        printf("Block manipulation failed. Sorry.\n");
        /* Drop the in-memory changes without writing them out */
        ext2fs_free(fs);
        return 1;
    }
    ext2fs_close(fs);
    return 0;
}