* Initial e2fsprogs TODO list (please expand)
From: Jose R. Santos @ 2008-04-15 16:52 UTC
To: linux-ext4

As discussed on the call yesterday, some folks (myself included) really
want a TODO list to help them keep track of what is left undone in
e2fsprogs as we try to get ext4 out the door.  Here is my initial list
of items that still need addressing.  Hopefully we can expand this list
and document it somewhere like the ext4 wiki or the SourceForge bug
tracker.

- Rename uninit_groups to uninit_bg to be consistent with other
  defined features.  Retain the old name for historical purposes.

- The return value of ext2fs_super_and_bgd_loc() is not to be trusted.
  Document this in the source code.

- Make sure ext2fs_super_and_bgd_loc() does not get used anywhere the
  return value is expected to be accurate (aside from mke2fs).

- Remove the lazy_bg feature from being set in mke2fs.  The feature has
  been declared a dangerous hack by its creator; remove it to avoid
  people building on top of it.

- Add flex_bg meta-data grouping support.

- Remove support for not zeroing the inode tables from the
  uninit_groups patches.  This support is dangerous without a proper
  kernel thread that zeros them in the background when the filesystem
  is mounted.  Depends on the lazy_bg removal.

- Activate undo-manager in mke2fs only when inode tables are not being
  zeroed.  Undo-manager is horribly slow if we need to store the
  information of all the blocks that have been zeroed during mke2fs.
  The amount of storage needed for the undo on a 16TB filesystem could
  be problematic.  Depends on kernel thread inode table zeroing.

- Make a 64-bit clean API that extends the existing one.  The current
  API cannot support block numbers larger than 32 bits, so a new set of
  API calls is needed in order to provide large filesystem support and
  retain backwards compatibility with the old API.

- 64-bit bitmap interface.  In order to support block numbers larger
  than 32 bits, a new bitmap interface is needed that can retain ABI
  compatibility with the old one.

-JRS
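The 32-bit limit behind the last two TODO items can be illustrated with a little arithmetic. What follows is a minimal sketch, not e2fsprogs code: the type names old_blk_t and new_blk_t and both helper functions are hypothetical, and the 4 KiB block size is an assumption (it is presumably also where the 16TB figure in the undo-manager item comes from).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 32-bit and a 64-bit block number type. */
typedef uint32_t old_blk_t;
typedef uint64_t new_blk_t;

/* Bytes addressable when block numbers are `bits` wide. */
static uint64_t max_fs_bytes(unsigned bits, uint64_t block_size)
{
    assert(bits > 0 && bits < 64);
    assert(block_size <= (UINT64_MAX >> bits)); /* keep the product in range */
    return ((uint64_t)1 << bits) * block_size;
}

/* What happens when a 64-bit block number is squeezed through a 32-bit
 * interface: the high bits are silently dropped. */
static old_blk_t narrow_block(new_blk_t blk)
{
    return (old_blk_t)blk;
}
```

With 32-bit block numbers and 4 KiB blocks, max_fs_bytes(32, 4096) gives 2^44 bytes, i.e. 16 TiB. narrow_block() maps block 2^32 + 5 onto block 5, which is the kind of aliasing a 64-bit clean API has to rule out while the old 32-bit entry points stay in place for existing callers.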
* Re: Initial e2fsprogs TODO list (please expand)
From: Andreas Dilger @ 2008-04-16 3:30 UTC
To: Jose R. Santos; +Cc: linux-ext4

On Apr 15, 2008  11:52 -0500, Jose R. Santos wrote:
> As discussed on the call yesterday, some folks (myself included) really
> want a TODO list to help them keep track of what is left undone in
> e2fsprogs as we try to get ext4 out the door.  Here is my initial list
> of items that still need addressing.  Hopefully we can expand this list
> and document it somewhere like the ext4 wiki or the SourceForge bug
> tracker.
>
> - Rename uninit_groups to uninit_bg to be consistent with other
>   defined features.  Retain the old name for historical purposes.
>
> - The return value of ext2fs_super_and_bgd_loc() is not to be trusted.
>   Document this in the source code.
>
> - Make sure ext2fs_super_and_bgd_loc() does not get used anywhere the
>   return value is expected to be accurate (aside from mke2fs).
>
> - Remove the lazy_bg feature from being set in mke2fs.  The feature has
>   been declared a dangerous hack by its creator; remove it to avoid
>   people building on top of it.
>
> - Add flex_bg meta-data grouping support.
>
> - Remove support for not zeroing the inode tables from the
>   uninit_groups patches.  This support is dangerous without a proper
>   kernel thread that zeros them in the background when the filesystem
>   is mounted.  Depends on the lazy_bg removal.

Something was lost in translation here.  The uninit_groups feature DOES
zero the inode tables by default, and marks the groups with ITABLE_ZEROED.
It is only if "-O uninit_groups,lazy_bg" are both given at the same time
that the itable is not initialized.  That is no different than if lazy_bg
was given by itself.

So nothing needs to be done in e2fsprogs until some time after the kernel
is updated to do the zeroing.

> - Activate undo-manager in mke2fs only when inode tables are not being
>   zeroed.  Undo-manager is horribly slow if we need to store the
>   information of all the blocks that have been zeroed during mke2fs.
>   The amount of storage needed for the undo on a 16TB filesystem could
>   be problematic.  Depends on kernel thread inode table zeroing.
>
> - Make a 64-bit clean API that extends the existing one.  The current
>   API cannot support block numbers larger than 32 bits, so a new set of
>   API calls is needed in order to provide large filesystem support and
>   retain backwards compatibility with the old API.
>
> - 64-bit bitmap interface.  In order to support block numbers larger
>   than 32 bits, a new bitmap interface is needed that can retain ABI
>   compatibility with the old one.

There are some notes on implementing more efficient bitmaps in
https://bugzilla.lustre.org/show_bug.cgi?id=12202

Even without 64-bit filesystems the memory consumption of e2fsck
can be quite high (2^32 blocks ~= 2^32 bytes of RAM for e2fsck).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
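The memory estimate at the end of the message above can be put in numbers. This is a rough sketch under stated assumptions, not a description of e2fsck internals: it models a flat bitmap costing one bit per block, both function names are hypothetical, and the count of eight simultaneous bitmaps used in the note below is chosen only because it makes the total come out at one byte per block. The thread does not say how many maps e2fsck actually holds.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one flat bitmap with one bit per block, rounded up. */
static uint64_t bitmap_bytes(uint64_t nblocks)
{
    return nblocks / 8 + (nblocks % 8 != 0);
}

/* Total for `nmaps` such bitmaps held in memory at the same time. */
static uint64_t total_bitmap_bytes(uint64_t nblocks, unsigned nmaps)
{
    uint64_t one = bitmap_bytes(nblocks);

    assert(nmaps == 0 || one <= UINT64_MAX / nmaps); /* no overflow */
    return one * nmaps;
}
```

For 2^32 blocks, bitmap_bytes() gives 2^29 bytes (512 MiB) for a single map, and total_bitmap_bytes() with eight maps gives 2^32 bytes (4 GiB). Either way the cost grows linearly with the block count, which is why a more compact bitmap representation matters even before 64-bit block numbers arrive.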
* Re: Initial e2fsprogs TODO list (please expand)
From: Jose R. Santos @ 2008-04-16 4:35 UTC
To: Andreas Dilger; +Cc: linux-ext4

On Tue, 15 Apr 2008 21:30:02 -0600
Andreas Dilger <adilger@sun.com> wrote:

> On Apr 15, 2008  11:52 -0500, Jose R. Santos wrote:
> > As discussed on the call yesterday, some folks (myself included) really
> > want a TODO list to help them keep track of what is left undone in
> > e2fsprogs as we try to get ext4 out the door.  Here is my initial list
> > of items that still need addressing.  Hopefully we can expand this list
> > and document it somewhere like the ext4 wiki or the SourceForge bug
> > tracker.
> >
> > - Rename uninit_groups to uninit_bg to be consistent with other
> >   defined features.  Retain the old name for historical purposes.
> >
> > - The return value of ext2fs_super_and_bgd_loc() is not to be trusted.
> >   Document this in the source code.
> >
> > - Make sure ext2fs_super_and_bgd_loc() does not get used anywhere the
> >   return value is expected to be accurate (aside from mke2fs).
> >
> > - Remove the lazy_bg feature from being set in mke2fs.  The feature has
> >   been declared a dangerous hack by its creator; remove it to avoid
> >   people building on top of it.
> >
> > - Add flex_bg meta-data grouping support.
> >
> > - Remove support for not zeroing the inode tables from the
> >   uninit_groups patches.  This support is dangerous without a proper
> >   kernel thread that zeros them in the background when the filesystem
> >   is mounted.  Depends on the lazy_bg removal.
>
> Something was lost in translation here.  The uninit_groups feature DOES
> zero the inode tables by default, and marks the groups with ITABLE_ZEROED.
> It is only if "-O uninit_groups,lazy_bg" are both given at the same time
> that the itable is not initialized.  That is no different than if lazy_bg
> was given by itself.

Yes, I understand this part.

> So nothing needs to be done in e2fsprogs until some time after the kernel
> is updated to do the zeroing.

The problem is that not initializing the inode table in the uninit
block group patch depends on a feature (lazy_bg) that Ted wants
removed.  I believe that just removing the lazy_bg feature would be
enough to remove this capability from the uninit patch, but I was not
entirely sure, so I added the item just to keep track of it.

If lazy_bg is in fact removed from e2fsprogs, I suppose we need to add
another item to enable lazy setup of the inode tables once the proper
support in the kernel is established.

> > - Activate undo-manager in mke2fs only when inode tables are not being
> >   zeroed.  Undo-manager is horribly slow if we need to store the
> >   information of all the blocks that have been zeroed during mke2fs.
> >   The amount of storage needed for the undo on a 16TB filesystem could
> >   be problematic.  Depends on kernel thread inode table zeroing.
> >
> > - Make a 64-bit clean API that extends the existing one.  The current
> >   API cannot support block numbers larger than 32 bits, so a new set of
> >   API calls is needed in order to provide large filesystem support and
> >   retain backwards compatibility with the old API.
> >
> > - 64-bit bitmap interface.  In order to support block numbers larger
> >   than 32 bits, a new bitmap interface is needed that can retain ABI
> >   compatibility with the old one.
>
> There are some notes on implementing more efficient bitmaps in
> https://bugzilla.lustre.org/show_bug.cgi?id=12202
>
> Even without 64-bit filesystems the memory consumption of e2fsck
> can be quite high (2^32 blocks ~= 2^32 bytes of RAM for e2fsck).
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>

-JRS
* Re: Initial e2fsprogs TODO list (please expand)
From: Andreas Dilger @ 2008-04-17 3:26 UTC
To: Jose R. Santos; +Cc: linux-ext4

On Apr 15, 2008  23:35 -0500, Jose R. Santos wrote:
> On Tue, 15 Apr 2008 21:30:02 -0600
> Andreas Dilger <adilger@sun.com> wrote:
> > Something was lost in translation here.  The uninit_groups feature DOES
> > zero the inode tables by default, and marks the groups with ITABLE_ZEROED.
> > It is only if "-O uninit_groups,lazy_bg" are both given at the same time
> > that the itable is not initialized.  That is no different than if lazy_bg
> > was given by itself.
>
> Yes, I understand this part.
>
> > So nothing needs to be done in e2fsprogs until some time after the kernel
> > is updated to do the zeroing.
>
> The problem is that not initializing the inode table in the uninit
> block group patch depends on a feature (lazy_bg) that Ted wants
> removed.  I believe that just removing the lazy_bg feature would be
> enough to remove this capability from the uninit patch, but I was not
> entirely sure, so I added the item just to keep track of it.
>
> If lazy_bg is in fact removed from e2fsprogs, I suppose we need to add
> another item to enable lazy setup of the inode tables once the proper
> support in the kernel is established.

Yes, the "lazy init" for uninit_groups will essentially be identical to
what we have in lazy_bg today.  So if we are disabling lazy_bg as a
user-selectable option, we should leave the code in place for later use.

I wouldn't object to requiring a user to specify "mke2fs -O FEATURE_C6"
to enable it.  That keeps it out of the hands of newbies, but leaves the
capability to test large filesystems w/o 45 minute mke2fs times.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
* Re: Initial e2fsprogs TODO list (please expand)
From: Theodore Tso @ 2008-04-17 3:36 UTC
To: Jose R. Santos, Andreas Dilger; +Cc: linux-ext4

On Tue, Apr 15, 2008 at 11:52:16AM -0500, Jose R. Santos wrote:
> As discussed on the call yesterday, some folks (myself included) really
> want a TODO list to help them keep track of what is left undone in
> e2fsprogs as we try to get ext4 out the door.  Here is my initial list
> of items that still need addressing.  Hopefully we can expand this list
> and document it somewhere like the ext4 wiki or the SourceForge bug
> tracker.
>
> - Rename uninit_groups to uninit_bg to be consistent with other
>   defined features.  Retain the old name for historical purposes.

Yes.  Although until we actually don't do lazy initialization of the
inode table, I still think the name is a bit of a misnomer.  It really
is more about checksumming the block group descriptors and a faster
fsck; whether or not we initialize the block groups is pretty much a
non-issue.

> - The return value of ext2fs_super_and_bgd_loc() is not to be trusted.
>   Document this in the source code.
>
> - Make sure ext2fs_super_and_bgd_loc() does not get used anywhere the
>   return value is expected to be accurate (aside from mke2fs).
>
> - Remove the lazy_bg feature from being set in mke2fs.  The feature has
>   been declared a dangerous hack by its creator; remove it to avoid
>   people building on top of it.

.... and to replace it, add a configuration parameter to
/etc/e2fsck.conf which controls whether or not the inode table and
bitmap blocks should be uninitialized when using uninit groups.  It
will default to off for now, until the kernel support can be
implemented.

> - Add flex_bg meta-data grouping support.

Once it is demonstrated to work correctly in all circumstances.  :-)

						- Ted
* Re: Initial e2fsprogs TODO list (please expand)
From: Theodore Tso @ 2008-04-20 23:47 UTC
To: Jose R. Santos; +Cc: linux-ext4

I found a badly out-of-date e2fsprogs todo page on the ext4 wiki, and
I've updated it with the todo items from this list.

http://ext4.wiki.kernel.org/index.php/E2fsprogs_features_and_patches

Some of the items marked "DONE" are in my tree and haven't been pushed
out yet, but I'll make sure that happens by Monday.  Note that I am
taking the red eye from Sao Paulo tonight, and if all goes well, am
scheduled to arrive in Boston at 10:15am Eastern.  If the flight gets
delayed, there is a chance that I may end up being late or missing the
ext4 call.

My intention is to try to get enough of the serious bugs fixed that we
can release 1.41-rc0 early this week.  The other thing that needs to
happen is preparing the ext4 queue for pushing to Linus.

						- Ted
* Re: Initial e2fsprogs TODO list (please expand)
From: Theodore Tso @ 2008-04-21 13:11 UTC
To: Jose R. Santos; +Cc: linux-ext4

On Sun, Apr 20, 2008 at 07:47:07PM -0400, Theodore Tso wrote:
> Some of the items marked "DONE" are in my tree and haven't been pushed
> out yet, but I'll make sure that happens by Monday.  Note that I am
> taking the red eye from Sao Paulo tonight, and if all goes well, am
> scheduled to arrive in Boston at 10:15am Eastern.  If the flight gets
> delayed, there is a chance that I may end up being late or missing the
> ext4 call.

Unfortunately, we were delayed in Sao Paulo for over three hours;
something about a problem with one of the fuel pumps....  So I've been
rebooked onto another flight, which means I'll be in the air at the
time of the ext4 call.

While I was stuck on the airplane, I spent some time doing more fixups
on the uninit_bg code to make it much cleaner and more robust, and I
also started rototilling the undo_mgr patches.  In addition to fixing
numerous style and usability problems, I also found the design problem
which caused it to be so slow.  It is using the first blocksize used to
write to the device as the tdb_data_size.  For mke2fs, this is 512
bytes, which means that for every single 4k inode table block write,
*eight* entries were getting made into the tdb database and the old
contents of the filesystem were getting stored in 512 byte chunks.  No
wonder it was so slow!!

I was able to show significant speedups by forcing the tdb_data_size
to be the filesystem blocksize, and I suspect that for mke2fs, if it
is initializing the inode table, using a tdb_data_size of something
like 32k or 64k would be even better.

Unfortunately I haven't made any progress on quality checking the
patches in the patch queue, since I found so much new code that just
screamed out for fixing in e2fsprogs.  Eric, if you have time, could
you look through the patch queue and help out with sanity-checking the
patches and making sure the patch descriptions are suitably
well-written without version control logs, XXX FIXME comments, or
other things that would make Linus vomit?  If you could, I'd really
appreciate it.  Thanks!!

						- Ted
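The tdb_data_size arithmetic in the message above can be sketched as follows. This is an illustration only, not the undo manager's code: the function name is hypothetical, and it assumes every write is aligned to the chunk size and that each chunk touched costs exactly one database record.

```c
#include <assert.h>
#include <stdint.h>

/* Number of undo records needed to save the old contents covered by one
 * write, when the old data is stored in chunks of tdb_data_size bytes.
 * Assumes the write starts on a chunk boundary. */
static uint64_t undo_records_for_write(uint64_t write_bytes,
                                       uint64_t tdb_data_size)
{
    assert(tdb_data_size > 0);
    return write_bytes / tdb_data_size + (write_bytes % tdb_data_size != 0);
}
```

Under these assumptions, undo_records_for_write(4096, 512) is 8, matching the eight entries per 4k block write described above, while a chunk size equal to the 4k filesystem blocksize brings that down to 1. A 64k run of inode table zeroing would cost 128 records at 512 bytes but a single record at a 64k chunk size, which is consistent with the suggestion that 32k or 64k might do better still for mke2fs.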
* Re: Initial e2fsprogs TODO list (please expand)
From: Eric Sandeen @ 2008-04-21 21:29 UTC
To: Theodore Tso; +Cc: Jose R. Santos, linux-ext4

Theodore Tso wrote:
> Unfortunately I haven't made any progress on quality checking the
> patches in the patch queue, since I found so much new code that just
> screamed out for fixing in e2fsprogs.  Eric, if you have time, could
> you look through the patch queue and help out with sanity-checking the
> patches and making sure the patch descriptions are suitably
> well-written without version control logs, XXX FIXME comments, or
> other things that would make Linus vomit?  If you could, I'd really
> appreciate it.  Thanks!!

I'll put it on the list :)  Spent today doing more RHEL-related stuff...

-Eric
Thread overview: 8+ messages

2008-04-15 16:52 Initial e2fsprogs TODO list (please expand) Jose R. Santos
2008-04-16  3:30 ` Andreas Dilger
2008-04-16  4:35   ` Jose R. Santos
2008-04-17  3:26     ` Andreas Dilger
2008-04-17  3:36   ` Theodore Tso
2008-04-20 23:47 ` Theodore Tso
2008-04-21 13:11   ` Theodore Tso
2008-04-21 21:29     ` Eric Sandeen