* Default ext inode size
From: Phillip Susi @ 2008-11-13 19:56 UTC
To: linux-fsdevel

I noticed that the default inode size for mkfs in e2fsprogs has been
changed to 256 bytes. I noticed this because I am seeing users complain
that they can no longer access their ext partitions using the Windows
driver, which only supports normal 128 byte inodes. I'd like to know
why this default was changed.

As I understand it, the larger inode size means that ea/acl can be
stored directly in the inode. Are there any other benefits? It seems
that using extended attributes is rather uncommon in the first place,
and that when they are used, many files often share them so it would be
better to leave them in the shared data block rather than duplicate them
in every inode. This leaves me wondering where is the common case that
benefits from a larger default inode?
* Re: Default ext inode size
From: Kalpak Shah @ 2008-11-13 20:14 UTC
To: Phillip Susi; +Cc: linux-fsdevel

On Fri, Nov 14, 2008 at 1:26 AM, Phillip Susi <psusi@cfl.rr.com> wrote:
> I noticed that the default inode size for mkfs in e2fsprogs has been changed
> to 256 bytes. I noticed this because I am seeing users complain that they
> can no longer access their ext partitions using the windows driver, which
> only supports normal 128 byte inodes. I'd like to know why this default was
> changed.
>
> As I understand it, the larger inode size means that ea/acl can be stored
> directly in the inode. Are there any other benefits? It seems that using
> extended attributes is rather uncommon in the first place, and that when
> they are used, many files often share them so it would be better to leave
> them in the shared data block rather than duplicate them in every inode.
> This leaves me wondering where is the common case that benefits from a
> larger default inode?

The larger inode is also needed to support new features like nanosecond
timestamps, creation time, 64-bit inode versions.

If you want to override the 256-byte inode default, you can use "-I 128"
while formatting your filesystems with mke2fs.

Thanks,
Kalpak
* Re: Default ext inode size
From: Theodore Tso @ 2008-11-13 20:21 UTC
To: Phillip Susi; +Cc: linux-fsdevel

On Thu, Nov 13, 2008 at 02:56:38PM -0500, Phillip Susi wrote:
> I noticed that the default inode size for mkfs in e2fsprogs has been
> changed to 256 bytes. I noticed this because I am seeing users complain
> that they can no longer access their ext partitions using the windows
> driver, which only supports normal 128 byte inodes. I'd like to know
> why this default was changed.
>
> As I understand it, the larger inode size means that ea/acl can be
> stored directly in the inode. Are there any other benefits?

That's the main one. The other benefit is that ext4 uses a bigger
inode to store some extra fields such as the file creation time,
nanosecond timestamps, and the 64-bit version number that is used for
NFSv4's client-side caching.

> It seems
> that using extended attributes is rather uncommon in the first place,

The big users of extended attributes are SELinux, Samba, and Beagle.
Since a number of distributions are now starting to enable SELinux by
default (for better or for worse), it makes a big difference from a
performance perspective for those distributions.

I can't imagine that it would be that hard to fix the Windows driver
to be able to support 256 byte inodes. It should be a one- or
two-line fix, for those people who care.

- Ted
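As an illustration of the kind of fix Ted describes, here is a minimal sketch of how a reader that does not use libext2fs could honor the inode size recorded in the superblock instead of hardcoding 128. It is not taken from the Windows driver or any other existing code. The field offsets (`s_magic` at byte 56, `s_rev_level` at byte 76, `s_inode_size` at byte 88 of the superblock, which itself starts 1024 bytes into the device) follow the on-disk ext2 superblock layout; the function names and the `path` argument are hypothetical.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the primary superblock starts 1024 bytes into the device
EXT2_MAGIC = 0xEF53        # value of s_magic for ext2/ext3/ext4
GOOD_OLD_INODE_SIZE = 128  # revision 0 filesystems always use 128-byte inodes


def inode_size_from_superblock(sb: bytes) -> int:
    """Return the on-disk inode size described by a raw superblock."""
    (magic,) = struct.unpack_from("<H", sb, 56)        # s_magic
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3/ext4 superblock")
    (rev_level,) = struct.unpack_from("<I", sb, 76)    # s_rev_level
    if rev_level == 0:
        # Revision 0 has no s_inode_size field; the size is fixed.
        return GOOD_OLD_INODE_SIZE
    (inode_size,) = struct.unpack_from("<H", sb, 88)   # s_inode_size
    return inode_size


def inode_offset_in_table(index_in_group: int, inode_size: int) -> int:
    """Byte offset of an inode within its group's inode table.

    A reader that hardcodes 128 here is what breaks on 256-byte inodes.
    """
    return index_in_group * inode_size


def read_inode_size(path: str) -> int:
    """Read the inode size from a device or image file (hypothetical helper)."""
    with open(path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET)
        return inode_size_from_superblock(dev.read(1024))
```

The point of the sketch is only that the stride used to index the inode table has to come from the superblock; a reader that does this would be expected to keep working with either default.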
* Re: Default ext inode size
From: Phillip Susi @ 2008-11-13 20:50 UTC
To: Theodore Tso; +Cc: linux-fsdevel

Theodore Tso wrote:
> That's the main one. The other benefit is that ext4 uses a bigger
> inode to store some extra fields such as the file creation time,
> nanosecond timestamps, and the 64-bit version number that is
> used for NFSv4's client-side caching.

What about ext3? Does it do the same thing with the larger inode? Does
it get more block pointers or room for block extent lists (in ext4)?
Are the higher resolution timestamps used by default if the inode is
large, or is there a compatibility bit and/or mount option that needs
to be set?

> The big users of extended attributes are SELinux, Samba, and Beagle.
> Since a number of distributions are now starting to enable SELinux by
> default (for better or for worse), it makes a big difference from a
> performance perspective for those distributions.

Does it actually help performance to store the ea in the inode? I would
think it would not make much difference: if many files have the same
attributes, then the shared ea block would be cached. Storing the ea in
the inode seems like it duplicates a lot of data and means a given
amount of RAM could only cache half as many inodes as with the normal
size, which would lead to fewer cache hits and more disk IO.

> I can't imagine that it would be that hard to fix the Windows driver
> to be able to support 256 byte inodes. It should be a one- or
> two-line fix, for those people who care.

Probably, but it is an example (and there probably are others) of
problems caused by changing the default, so I'm trying to understand why
ext3 was disturbed in this way rather than just make 256 byte inodes the
default for only ext4.
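The "half as many inodes" point above is simple arithmetic on the inode table, which can be sketched as follows. The 4096-byte block size is an assumption chosen for illustration, and the sketch only counts on-disk inodes per cached inode-table block; it says nothing about the kernel's separate in-memory inode objects.

```python
def inodes_per_block(block_size: int, inode_size: int) -> int:
    """Number of on-disk inodes that fit in one inode-table block."""
    return block_size // inode_size


BLOCK_SIZE = 4096  # assumed block size, for illustration only

for inode_size in (128, 256):
    count = inodes_per_block(BLOCK_SIZE, inode_size)
    print(f"{inode_size}-byte inodes: {count} per {BLOCK_SIZE}-byte block")
```

Under that assumption a cached block holds 32 inodes at 128 bytes and 16 at 256 bytes, which is the factor of two Phillip is referring to.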
* Re: Default ext inode size
From: Theodore Tso @ 2008-11-13 21:35 UTC
To: Phillip Susi; +Cc: linux-fsdevel

On Thu, Nov 13, 2008 at 03:50:08PM -0500, Phillip Susi wrote:
> Theodore Tso wrote:
>> That's the main one. The other benefit is that ext4 uses a bigger
>> inode to store some extra fields such as the file creation time,
>> nanosecond timestamps, and the 64-bit version number that is
>> used for NFSv4's client-side caching.
>
> What about ext3? Does it do the same thing with the larger inode? Does
> it get more block pointers or room for block extent lists ( in ext4 )?
> Are the higher resolution timestamps used by default if the inode is
> large, or is there a compatibility bit and/or mount option that needs
> set?

No, ext3 doesn't have any of these features; only ext4. It would be
possible to backport things like the high res timestamps to ext3, but
no one has done it to date.

We don't actually use extra block extent lists in ext4 with the extra
space. We could, but it's not clear how much it is necessary. On my
laptop, which has been running ext4 since July, a recent check on my
system showed that I had 1,058,309 inodes used, and of those roughly
one million inodes, 548 inodes had an extent depth of one, and exactly
2 inodes had an extent depth of two. All of the other inodes had 3 or
fewer extents, so they all fit inside the current inode's direct block
array. In fact, all but 10,876 inodes were contiguous (i.e., only
needed one extent). Granted, my laptop is used as a development
machine, which means the bulk of the files are Maildir directories,
git repositories, and build trees, which might not be representative
of, say, server workloads.
Still, statistics of 98.97% of the files being completely contiguous,
and 99.95% of the files being able to store all of their extents
inside the inode table and not needing to spill to an external extent
tree are pretty impressive, and a good reason for folks to migrate to
ext4 when they have a chance. :-)

> Does it actually help performance to store the ea in the inode? I would
> think it would not make much difference if many files have the same
> attributes, then the shared ea block would be cached.

I don't use SELinux, so I can't speak to this from personal
experience. People who have enabled it have told me the performance
difference is "dramatic". It also seemed that with the advent of
desktop programs like Beagle, there were more desktop applications
using file-unique extended attributes, and that my personal battle to
tell application programmers that "friends don't let friends use
extended attributes" was a losing fight, and given that the GNOME and
KDE application programmers have filesystem engineers vastly
outnumbered, it was better to assume that we weren't going to win this
one, as they pick up bad programming habits from platforms such as
Mac OS X. :-)

> Probably, but it is an example ( and there probably are others ) of
> problems caused by changing the default, so I'm trying to understand why
> ext3 was disturbed in this way rather than just make 256 byte inodes the
> default for only ext4.

We didn't "disturb" ext3; we just changed the default, to reflect the
changing usage of filesystems. If a system administrator is convinced
that they know better, they can always adjust /etc/mke2fs.conf, and
change the default to something else.

I know of only one other problem that was turned up when we changed
the default, which was that some boot loaders didn't know how to deal
with 256 byte inodes. But that got fixed pretty fast.
Realistically, any userspace program that uses libext2fs would have
been fine, since it always did the right thing with larger inode
sizes. It was only programs that didn't use the standard ext2 library
that would get bitten, and there are very few of those around.

- Ted
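The two percentages quoted in the message above follow directly from the inode counts given there. A small sketch of the arithmetic, using only the figures from the message (the variable names are illustrative):

```python
# Counts quoted in the message above.
total_inodes = 1_058_309
depth_one = 548          # inodes with an extent tree of depth one
depth_two = 2            # inodes with an extent tree of depth two
non_contiguous = 10_876  # inodes needing more than one extent

# Share of inodes that are a single contiguous extent.
contiguous_pct = 100 * (total_inodes - non_contiguous) / total_inodes

# Share of inodes whose extents all fit inside the inode itself,
# i.e. everything that did not need an external extent tree.
in_inode_pct = 100 * (total_inodes - depth_one - depth_two) / total_inodes

print(f"contiguous: {contiguous_pct:.2f}%")
print(f"extents stored in the inode: {in_inode_pct:.2f}%")
```

Rounded to two decimal places, these come out to 98.97% and 99.95%, matching the figures in the message.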
end of thread, other threads:[~2008-11-13 21:35 UTC | newest]

Thread overview: 5+ messages
2008-11-13 19:56 Default ext inode size Phillip Susi
2008-11-13 20:14 ` Kalpak Shah
2008-11-13 20:21 ` Theodore Tso
2008-11-13 20:50   ` Phillip Susi
2008-11-13 21:35     ` Theodore Tso