* Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Winston Wen @ 2025-12-10 7:02 UTC
To: linux-ext4

Hello ext4 maintainers and community,

I am writing to seek your advice on a potential enhancement regarding
filename length support in ext4.

Currently, ext4 supports filenames up to 255 bytes, which is sufficient
for most use cases. However, in cross-platform scenarios, particularly
when migrating directories from Windows to Linux, we encounter issues
with filenames that exceed this limit. Windows allows filenames longer
than 256 bytes (including multi-byte characters such as Chinese), which
can lead to filename overflow when copying such files to ext4.

We are aware that workarounds like wrapfs can be used to support longer
filenames, but in practice, this approach is not ideal for seamless
user experience. We are therefore curious whether it would be feasible
to implement built-in support for longer filenames in ext4 itself.

One idea we considered is using extended attributes (xattr) to map long
filenames, or perhaps another mechanism that would allow storing and
accessing filenames beyond the current limit without breaking existing
compatibility. However, we are not experts in this area and would
appreciate guidance from the community.

Could you share your thoughts on:

- Whether there is interest in supporting longer filenames in ext4
  natively.
- Possible implementation approaches (e.g., xattr-based mapping,
  on-disk format extensions, etc.).
- Any prior discussions or attempts in this direction that we might
  have missed.

If there is a feasible path forward, we are willing to research the
issue in depth and attempt to implement an RFC patch for community
review. We would greatly appreciate your guidance on where to start and
what the key considerations would be.

Thank you for your time and insights.

-- 
Thanks,
Winston

* Re: Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Theodore Tso @ 2025-12-10 9:05 UTC
To: Winston Wen; +Cc: linux-ext4

On Wed, Dec 10, 2025 at 03:02:11PM +0800, Winston Wen wrote:
> We are aware that workarounds like wrapfs can be used to support longer
> filenames, but in practice, this approach is not ideal for seamless
> user experience. We are therefore curious whether it would be feasible
> to implement built-in support for longer filenames in ext4 itself.

I don't think wrapfs can be used to support longer file names, because
the limitation is quite fundamental. For example, the glibc definition
of struct dirent (which is returned by the readdir() system call) is as
follows (from the man readdir page):

    struct dirent {
        ino_t          d_ino;       /* Inode number */
        off_t          d_off;       /* Not an offset; see below */
        unsigned short d_reclen;    /* Length of this record */
        unsigned char  d_type;      /* Type of file; not supported
                                       by all filesystem types */
        char           d_name[256]; /* Null-terminated filename */
    };

So how you might store the longer file name isn't really going to help;
the problem goes far beyond the question of where this might be stored
on the file system.

					- Ted

* Re: Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Winston Wen @ 2025-12-10 9:32 UTC
To: Theodore Tso; +Cc: linux-ext4

On Wed, 10 Dec 2025 18:05:36 +0900
"Theodore Tso" <tytso@mit.edu> wrote:

> On Wed, Dec 10, 2025 at 03:02:11PM +0800, Winston Wen wrote:
> > We are aware that workarounds like wrapfs can be used to support
> > longer filenames, but in practice, this approach is not ideal for
> > seamless user experience. We are therefore curious whether it would
> > be feasible to implement built-in support for longer filenames in
> > ext4 itself.
>
> I don't think wrapfs can be used to support longer file names, because
> the limitation is quite fundamental. For example, the glibc
> definition of struct dirent (which is returned by the readdir() system
> call) is as follows (from the man readdir page):
>
>     struct dirent {
>         ino_t          d_ino;       /* Inode number */
>         off_t          d_off;       /* Not an offset; see below */
>         unsigned short d_reclen;    /* Length of this record */
>         unsigned char  d_type;      /* Type of file; not supported
>                                        by all filesystem types */
>         char           d_name[256]; /* Null-terminated filename */
>     };
>
> So how you might store the longer file name isn't really going to
> help, the problem goes far beyond the question of where this might be
> stored on the file system.
>
> 					- Ted
>

Hi Ted,

Thank you for your quick and insightful reply.

I apologize if I've misunderstood something, but based on our
experience, we have actually implemented and deployed two different
solutions using FUSE and wrapfs in our production environment, both of
which successfully support filenames longer than 256 bytes. This leads
us to believe that the glibc and VFS layers do not impose a hard limit
at 256 bytes in practice.

To better understand, I've reviewed the readdir/getdents man pages and
the glibc struct dirent definition. It appears that d_name is
implemented as a flexible array member rather than a fixed-size array
of 256 bytes.

Going back to our original question: we were curious whether it might
be possible to support longer filenames natively within ext4 itself
(rather than through FUSE), perhaps via on-disk format extension or
auxiliary storage like xattrs. If this is architecturally feasible, we
would be very interested in exploring it further.

Any further guidance or references you could share would be greatly
appreciated. Thanks again for your time.

-- 
Thanks,
Winston
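
The question of whether d_name is really capped at 256 bytes can be probed from userspace. The sketch below is only illustrative and is not code from anyone in this thread: `fs_name_max` and `longest_entry_name` are hypothetical helper names. It relies on the advice in the readdir(3) man page that callers should take the name length from `strlen(d_name)` rather than `sizeof(d_name)`, since some filesystems can return names longer than the declared array.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Per-filesystem name limit as reported for `dir`, or -1 if unknown. */
long fs_name_max(const char *dir)
{
    assert(dir != NULL);
    return pathconf(dir, _PC_NAME_MAX);
}

/* Length in bytes of the longest entry name in `dir`, or -1 on error.
 * The length comes from the string itself, never from sizeof(d_name). */
long longest_entry_name(const char *dir)
{
    assert(dir != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    size_t longest = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        size_t n = strlen(e->d_name);
        if (n > longest)
            longest = n;
    }
    closedir(d);
    return (long)longest;
}
```

Comparing `longest_entry_name()` against `fs_name_max()` on a FUSE or wrapfs mount would show whether that mount actually hands back names past the reported limit. Code that copies `d_name` into a `char buf[NAME_MAX + 1]` is the kind of hidden dependency such names could break.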

* Re: Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Theodore Tso @ 2025-12-10 23:24 UTC
To: Winston Wen; +Cc: linux-ext4

On Wed, Dec 10, 2025 at 05:32:02PM +0800, Winston Wen wrote:
> To better understand, I've reviewed the readdir/getdents man pages and
> the glibc struct dirent definition. It appears that d_name is
> implemented as a flexible array member rather than a fixed-size array
> of 256 bytes.

Interesting; the structure definition I quoted was from the readdir man
page.

I suspect there may be some number of random failures that might occur
because of hidden dependencies on the historical / traditional value
of NAME_MAX. For example, it might be OK for glibc; but what about
other C libraries that ship on Linux, such as musl, dietlibc, bionic,
etc.?

> Going back to our original question: we were curious whether it might
> be possible to support longer filenames natively within ext4 itself
> (rather than through FUSE), perhaps via on-disk format extension or
> auxiliary storage like xattrs. If this is architecturally feasible, we
> would be very interested in exploring it further.

Well, extended attributes won't work, because xattrs are associated
with the inode, not the directory entry. So you need to handle cases
where the file has multiple hard links. And if you are doing a lookup
by long file name, there's a chicken and egg problem; you can't match
against the full filename until you read the xattr, and you can't do
that until you've done the lookup.

The only way to do this would be to make an incompatible change to the
directory layout. And doing this would require either refactoring or
doing extensive rework of the code in fs/ext4/namei.c and
fs/ext4/dir.c, to support both the original v1 version of the
directory layout and the v2 version of the directory layout, as well
as handling the v2 version of the directory when it is encrypted.

It's _doable_, but it's a huge amount of work. So the question is
whether it's worth it. If this is some random class project where you
don't care about bugs or reliability, that's one thing. If this is
something that needs to be hardened for production usage, it's quite a
lot more work.

Why are you interested in doing this? Is there business justification
such that your company would be willing to invest a significant amount
of effort?

Cheers,

					- Ted
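
The hard-link objection can be illustrated with a small userspace sketch. It is only a demonstration of the general POSIX behaviour, not ext4 code: `same_inode` and `make_hard_link` are hypothetical helper names. Two directory entries created with link(2) resolve to one inode, so anything attached to the inode, such as an xattr, is shared by both names and cannot say which name it belongs to.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if both paths resolve to the same inode on the same device,
 * 0 if they do not, and -1 if either path cannot be stat'ed. */
int same_inode(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    struct stat sa, sb;
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Creates an empty file at `path`, adds a second name `alias` for it,
 * and returns the resulting link count, or -1 on error. */
long make_hard_link(const char *path, const char *alias)
{
    assert(path != NULL && alias != NULL);

    int fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (link(path, alias) != 0) {
        unlink(path);
        return -1;
    }

    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}
```

After `make_hard_link(a, b)` succeeds, `same_inode(a, b)` reports that both names share one inode. A per-inode "long name" attribute would therefore need extra bookkeeping to record one long name per link.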

* Re: Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Winston Wen @ 2025-12-11 1:21 UTC
To: Theodore Tso; +Cc: linux-ext4

On Thu, 11 Dec 2025 08:24:59 +0900
"Theodore Tso" <tytso@mit.edu> wrote:

Hi Ted,

Thank you for your detailed and thoughtful response. Your insights are
greatly appreciated.

> On Wed, Dec 10, 2025 at 05:32:02PM +0800, Winston Wen wrote:
> > To better understand, I've reviewed the readdir/getdents man pages
> > and the glibc struct dirent definition. It appears that d_name is
> > implemented as a flexible array member rather than a fixed-size
> > array of 256 bytes.
>
> Interesting; the structure definition I quoted was from the readdir
> man page.

Indeed, the readdir.3 man page does show the definition you quoted. The
actual implementation in glibc headers and the readdir.2/getdents.2
system call interface uses a flexible array, as you noted.

>
> I suspect there may be some number of random failures that might occur
> because of hidden dependencies on the historical / traditional value
> of NAME_MAX. For example, it might be OK for glibc; but what about
> other C libraries that ship on Linux, such as musl, dietlibc, bionic,
> etc.?
>

This is a valid concern. We have not tested with other C libraries such
as musl, dietlibc, or bionic. In our use case, primarily file migration
and document management operations, we have been using FUSE or wrapfs
to mount specific directories on demand, rather than entire home
directories. This has worked for our limited scope.

Your point also reminds me that seeking universal support may not be
the right path. NAME_MAX has been around for a long time, and many code
paths may depend on it in ways we haven't encountered yet.

> > Going back to our original question: we were curious whether it
> > might be possible to support longer filenames natively within ext4
> > itself (rather than through FUSE), perhaps via on-disk format
> > extension or auxiliary storage like xattrs. If this is
> > architecturally feasible, we would be very interested in exploring
> > it further.
>
> Well, extended attributes won't work, because xattrs are associated
> with the inode, not the directory entry. So you need to handle cases
> where the file has multiple hard links. And if you are doing a lookup
> by long file name, there's a chicken and egg problem; you can't match
> against the full filename until you read the xattr, and you can't do
> that until you've done the lookup.
>

Understood, that makes sense. Thank you for explaining the fundamental
limitation with xattrs.

> The only way to do this would be to make an incompatible change to the
> directory layout. And doing this would require either refactoring or
> doing extensive rework of the code in fs/ext4/namei.c and
> fs/ext4/dir.c, to support both the original v1 version of the
> directory layout and the v2 version of the directory layout, as well
> as handling the v2 version of the directory when it is encrypted.
> It's _doable_, but it's a huge amount of work. So the question is
> whether it's worth it. If this is some random class project where you
> don't care about bugs or reliability, that's one thing. If this is
> something that needs to be hardened for production usage, it's quite a
> lot more work.
>

I completely agree. Changing the on-disk format would indeed be a
massive undertaking and far beyond our current capacity. It would also
introduce stability and compatibility risks that we are not prepared to
take. Given that, we will likely not pursue this direction.

> Why are you interested in doing this? Is there business justification
> such that your company would be willing to invest a significant amount
> of effort?

This is actually an interesting point. Personally, I've never hit this
limit in my own Linux usage; 80 Chinese characters is far more than I
typically need. However, many of our customers work with official
documents (e.g., government or enterprise paperwork) where filenames
regularly exceed this limit. So the requirement comes from real-world
business projects.

That said, because the need is mostly confined to document directories,
we can continue to use FUSE or wrapfs for targeted support, even though
it introduces some usability overhead. It's a workable solution for our
specific scenario.

>
> Cheers,
>
> 					- Ted
>

Thank you again for taking the time to explain the technical and
practical constraints. Your guidance has been very helpful.

-- 
Thanks,
Winston
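
The "80 Chinese characters" figure follows from the limit being counted in bytes, not characters. The sketch below is an illustration under stated assumptions, not code from the thread: the helper names are hypothetical, `NAME_LIMIT_BYTES` is simply the 255-byte limit quoted at the start of the thread, and filenames are assumed to be valid UTF-8. Common CJK characters take 3 bytes each in UTF-8, so at most 255 / 3 = 85 of them fit in one name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 255-byte per-name limit quoted in the thread (an assumed constant
 * for this sketch; real code should query the filesystem instead). */
#define NAME_LIMIT_BYTES 255

/* Counts code points in a valid UTF-8 string by counting every byte
 * that is not a continuation byte (continuation bytes look like 10xxxxxx). */
size_t utf8_code_points(const char *s)
{
    assert(s != NULL);

    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Returns 1 if `name` fits within the byte limit, 0 otherwise. */
int fits_in_name_limit(const char *name)
{
    assert(name != NULL);
    return strlen(name) <= NAME_LIMIT_BYTES;
}
```

A name of 85 three-byte characters is exactly 255 bytes and fits; an 86th pushes it to 258 bytes and it no longer fits, even though 86 characters looks well under "255".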

* Re: Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Andi Kleen @ 2025-12-12 18:10 UTC
To: Theodore Tso; +Cc: Winston Wen, linux-ext4

"Theodore Tso" <tytso@mit.edu> writes:
>
> Well, extended attributes won't work, because xattrs are associated
> with the inode, not the directory entry. So you need to handle cases
> where the file has multiple hard links. And if you are doing a lookup
> by long file name, there's a chicken and egg problem; you can't match
> against the full filename until you read the xattr, and you can't do
> that until you've done the lookup.

Perhaps you could use xattrs on the directory inode to store the longer
names, or the overflow.

One problem is that they may need to be big, exceeding xattr
limits, but perhaps some total limit on the longer file names
would be acceptable.

-Andi

* Re: Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Theodore Tso @ 2025-12-12 23:35 UTC
To: Andi Kleen; +Cc: Winston Wen, linux-ext4

On Fri, Dec 12, 2025 at 10:10:36AM -0800, Andi Kleen wrote:
>
> Perhaps you could use xattrs on the directory inode to store the longer
> names, or the overflow.
>
> One problem is that they may need to be big, exceeding xattr
> limits, but perhaps some total limit on the longer file names
> would be acceptable.

With ext4, there is a limit of a single file system block for all
extended attributes. You can store the value of an extended attribute
in an inode, in which case you only have the four byte inode number in
the xattr block. But still, if you assume 16 bytes of overhead for
each xattr entry, plus the xattr header, there's only room for nine
400-byte directory entries.

And you wouldn't want to have a lot of directory entries stored in
xattrs anyway, since searching them would have to be a brute force,
O(n) search. You wouldn't be able to use a hash tree for fast
lookups.

So even if you didn't have the xattr limits, if you had a large number
of very long file names stored in xattrs, it would be a performance
disaster.

Cheers,

						- Ted
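
The capacity estimate above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch, not ext4 code: `xattr_block_capacity` is a hypothetical helper, the 4096-byte block size and 32-byte header size are assumptions that the message does not state, and only the 16-byte per-entry overhead and 400-byte name length come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Number of names of `name_len` bytes that fit in one xattr block,
 * given a fixed header and a fixed per-entry overhead. Integer
 * division rounds down, since a partial entry cannot be stored. */
size_t xattr_block_capacity(size_t block_size, size_t header_size,
                            size_t entry_overhead, size_t name_len)
{
    assert(block_size >= header_size);
    assert(entry_overhead + name_len > 0);
    return (block_size - header_size) / (entry_overhead + name_len);
}
```

With the assumed values, `xattr_block_capacity(4096, 32, 16, 400)` is (4096 - 32) / (16 + 400) = 4064 / 416, which rounds down to 9 and matches the figure in the message. The count is per directory, because all of a directory inode's xattrs share the one block.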

* Re: Inquiry: Possible built-in support for longer filenames in ext4 (beyond 256 bytes)
From: Andi Kleen @ 2025-12-15 15:04 UTC
To: Theodore Tso; +Cc: Winston Wen, linux-ext4

On Sat, Dec 13, 2025 at 08:35:37AM +0900, Theodore Tso wrote:
> On Fri, Dec 12, 2025 at 10:10:36AM -0800, Andi Kleen wrote:
> >
> > Perhaps you could use xattrs on the directory inode to store the longer
> > names, or the overflow.
> >
> > One problem is that they may need to be big, exceeding xattr
> > limits, but perhaps some total limit on the longer file names
> > would be acceptable.
>
> With ext4, there is a limit of a single file system block for all
> extended attributes. You can store the value of an extended attribute
> in

With the bs>page size support this special use case could use larger
blocks.

> an inode, in which case you only have the four byte inode number in
> the xattr block. But still, if you assume 16 bytes of overhead for
> each xattr entry, plus the xattr header, there's only room for nine
> 400-byte directory entries.

You would only need to store the overflow, and I assume most uses
would be much shorter anyway. But yes, it would add some limit to the
number of file names, but perhaps with a 64k block and an average
closer to 200 bytes it isn't that bad.

>
> And you wouldn't want to have a lot of directory entries stored in
> xattrs anyway, since searching them would have to be a brute force,
> O(n) search. You wouldn't be able to use a hash tree for fast
> lookups.

You could still do the hash on the original directory, and then perhaps
use 4 bytes of the original file name to point to an xattr offset for
the overflow. This would be incompatible of course, but not too far
from the original ext4.

Perhaps it would also need something to maintain an efficient free list
in the xattr, or maybe that could just be done in memory.

-Andi