* Re: Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry. [not found] <20041116183813.11cbf280.akpm@osdl.org> @ 2004-11-17 22:34 ` Theodore Ts'o 2004-11-18 0:00 ` Stephen C. Tweedie 0 siblings, 1 reply; 7+ messages in thread From: Theodore Ts'o @ 2004-11-17 22:34 UTC (permalink / raw) To: Andrew Morton Cc: r6144, linux-kernel, ext2-devel, phillips, Alex Tomas, Christopher Li, Christopher Li, Stephen C. Tweedie > [1.] One line summary of the problem: > telldir() broken on large ext3 dir_index'd directories because > getdents() gives d_off==0 for the first entry Here's a patch which fixes the problem, but note the following warning from the readdir man page: According to POSIX, the dirent structure contains a field char d_name[] of unspecified size, with at most NAME_MAX characters preceding the terminating null character. Use of other fields will harm the porta- bility of your programs. Also, as always, telldir() and seekdir() are truly awful interfaces because they implicitly assume that (a) a directory is a linear data structure, and (b) that the position in a directory can be expressed in a cookie which hsa only 31 bits on 32-bit systems. So there will be hash colliions that will cause programs that assume that seekdir(dirent->d_off) will always return the next directory entry to sometimes lose directory entries in the not-as-unlikely-as-we-would wish case of a 31-bit hash collision. Really, any program which is using telldir/seekdir really should be rewritten to not use these interfaces if at all possible. So with these caveats.... - Ted Here is a patch which causes d_off of '.' to be 1, and for seekdir(1) to cause readdir to return the directory entry of '..'. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> ===== fs/ext3/namei.c 1.59 vs edited ===== --- 1.59/fs/ext3/namei.c 2004-10-19 05:40:30 -04:00 +++ edited/fs/ext3/namei.c 2004-11-17 17:14:06 -05:00 @@ -610,10 +610,15 @@ int ext3_htree_fill_tree(struct file *di de = (struct ext3_dir_entry_2 *) frames[0].bh->b_data; if ((err = ext3_htree_store_dirent(dir_file, 0, 0, de)) != 0) goto errout; + count++; + start_hash=2; + } + if (start_hash==2 && start_minor_hash==0) { + de = (struct ext3_dir_entry_2 *) frames[0].bh->b_data; de = ext3_next_entry(de); - if ((err = ext3_htree_store_dirent(dir_file, 0, 0, de)) != 0) + if ((err = ext3_htree_store_dirent(dir_file, 2, 0, de)) != 0) goto errout; - count += 2; + count++; } while (1) { ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry. 2004-11-17 22:34 ` Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry Theodore Ts'o @ 2004-11-18 0:00 ` Stephen C. Tweedie 2004-11-18 4:53 ` Theodore Ts'o 0 siblings, 1 reply; 7+ messages in thread From: Stephen C. Tweedie @ 2004-11-18 0:00 UTC (permalink / raw) To: Theodore Ts'o Cc: Andrew Morton, r6144, linux-kernel, ext2-devel@lists.sourceforge.net, phillips, Alex Tomas, Christopher Li, Christopher Li, Stephen Tweedie Hi, On Wed, 2004-11-17 at 22:34, Theodore Ts'o wrote: > Here is a patch which causes d_off of '.' to be 1, and for seekdir(1) > to cause readdir to return the directory entry of '..'. Doesn't this make things worse? The old problem was that seekdir/telldir were broken (which we already knew for certain cases like hash collisions). But with... > + start_hash=2; don't we end up silently ignoring all dirents with a major hash <= 1, even for unbroken getdents() with no intervening seekdir? Previously we'd at least fill them into the rbtree, so sequential readdir would find them even if the hash collided. Now we'll skip them entirely. If we're going to do this, I think we need to stuff . and .. into the rbtree with the right hashes, but without ignoring other existing dirents with colliding hashes. --Stephen ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry. 2004-11-18 0:00 ` Stephen C. Tweedie @ 2004-11-18 4:53 ` Theodore Ts'o 2004-11-18 11:22 ` Jan Engelhardt 2004-11-18 15:37 ` Stephen C. Tweedie 0 siblings, 2 replies; 7+ messages in thread From: Theodore Ts'o @ 2004-11-18 4:53 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Andrew Morton, r6144, linux-kernel, ext2-devel@lists.sourceforge.net, phillips, Alex Tomas, Christopher Li, Christopher Li On Thu, Nov 18, 2004 at 12:00:05AM +0000, Stephen C. Tweedie wrote: > > Doesn't this make things worse? > don't we end up silently ignoring all dirents with a major hash <= 1, > even for unbroken getdents() with no intervening seekdir? Oops, yes, I screwed up. > If we're going to do this, I think we need to stuff . and .. into the > rbtree with the right hashes, but without ignoring other existing > dirents with colliding hashes. We can't just do that, because there are programs that's assume '.' and '..' are the first and second entries in the directory. Yes, they are broken and non-portable, but so are programs that depend on d_off.... So instead what we need to do is wire '.' and '..' to have hash values of (0,0) and (2,0), respectively, without ignoring other existing dirents with colliding hashes. (In those cases the programs will break, but they are statistically rare, and there's not much we can do in those cases anyway.) This patch should do this. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> --- 1.59/fs/ext3/namei.c 2004-10-19 05:40:30 -04:00 +++ edited/fs/ext3/namei.c 2004-11-17 23:05:35 -05:00 @@ -610,10 +610,14 @@ int ext3_htree_fill_tree(struct file *di de = (struct ext3_dir_entry_2 *) frames[0].bh->b_data; if ((err = ext3_htree_store_dirent(dir_file, 0, 0, de)) != 0) goto errout; + count++; + } + if (start_hash < 2 || (start_hash ==2 && start_minor_hash==0)) { + de = (struct ext3_dir_entry_2 *) frames[0].bh->b_data; de = ext3_next_entry(de); - if ((err = ext3_htree_store_dirent(dir_file, 0, 0, de)) != 0) + if ((err = ext3_htree_store_dirent(dir_file, 2, 0, de)) != 0) goto errout; - count += 2; + count++; } while (1) { ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry. 2004-11-18 4:53 ` Theodore Ts'o @ 2004-11-18 11:22 ` Jan Engelhardt 2004-11-18 14:06 ` Theodore Ts'o 2004-11-18 15:37 ` Stephen C. Tweedie 1 sibling, 1 reply; 7+ messages in thread From: Jan Engelhardt @ 2004-11-18 11:22 UTC (permalink / raw) To: Theodore Ts'o Cc: Stephen C. Tweedie, Andrew Morton, r6144, linux-kernel, ext2-devel@lists.sourceforge.net, phillips, Alex Tomas, Christopher Li, Christopher Li >So instead what we need to do is wire '.' and '..' to have hash values >of (0,0) and (2,0), respectively, without ignoring other existing >dirents with colliding hashes. (In those cases the programs will >break, but they are statistically rare, and there's not much we can do >in those cases anyway.) IMO it's better to fix the mess all at once to have it weeded out for some months. Jan Engelhardt -- Gesellschaft für Wissenschaftliche Datenverarbeitung Am Fassberg, 37077 Göttingen, www.gwdg.de ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry. 2004-11-18 11:22 ` Jan Engelhardt @ 2004-11-18 14:06 ` Theodore Ts'o 2004-11-18 15:40 ` Jan Engelhardt 0 siblings, 1 reply; 7+ messages in thread From: Theodore Ts'o @ 2004-11-18 14:06 UTC (permalink / raw) To: Jan Engelhardt Cc: Stephen C. Tweedie, Andrew Morton, r6144, linux-kernel, ext2-devel@lists.sourceforge.net, phillips, Alex Tomas, Christopher Li, Christopher Li On Thu, Nov 18, 2004 at 12:22:38PM +0100, Jan Engelhardt wrote: > >So instead what we need to do is wire '.' and '..' to have hash values > >of (0,0) and (2,0), respectively, without ignoring other existing > >dirents with colliding hashes. (In those cases the programs will > >break, but they are statistically rare, and there's not much we can do > >in those cases anyway.) > > IMO it's better to fix the mess all at once to have it weeded out for some > months. Programs that assume that '.' and '..' are the first and second entries of a directory are intrinsically broken; POSIX never guaranteed this to be the case. Unfortunately, historically things have always worked that way, and so there may be some broken programs lurking out there. But there's really not much we can do. Before, we hard-wired '.' and '..' to always be first, at the cost of breaking programs that used the (broken by design) POSIX telldir/seekdir interfaces. Since telldir/seekdir, however badly designed, are part of POSIX, it seems appropriate to let those programs work, but the cost is a statistical probability that programs making assumptions about the order of '.' and '..' will break. We don't really have a choice here. (Actually, I guess we could define a new hash function that never produces certain hash values, but that would break compatibility with all existing deployed filesystems that use ext3 htree. That's not an option, either. So again, making a best effort, but breaking programs that are fundamentally broken is the best we can do.) - Ted ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry. 2004-11-18 14:06 ` Theodore Ts'o @ 2004-11-18 15:40 ` Jan Engelhardt 0 siblings, 0 replies; 7+ messages in thread From: Jan Engelhardt @ 2004-11-18 15:40 UTC (permalink / raw) To: Theodore Ts'o Cc: Stephen C. Tweedie, Andrew Morton, r6144, linux-kernel, ext2-devel@lists.sourceforge.net, phillips, Alex Tomas, Christopher Li, Christopher Li >> >So instead what we need to do is wire '.' and '..' to have hash values >> >of (0,0) and (2,0), respectively, without ignoring other existing >> >dirents with colliding hashes. (In those cases the programs will >> >break, but they are statistically rare, and there's not much we can do >> >in those cases anyway.) >> >> IMO it's better to fix the mess all at once to have it weeded out for some >> months. > >Programs that assume that '.' and '..' are the first and second [...] >But there's really not much we can do. > >Before, we hard-wired '.' and '..' to always be first [...] >We don't really have a choice here. > >(Actually, I guess we could define a new hash function [...] >So again, making a best effort, but breaking programs >that are fundamentally broken is the best we can do.) Looks like there's only way -- or: two ways and one choice: - Either leave it all as it is ATM (kind of unsatisfying) or - (unintentionally) break all apps now, or announce that they could brake if they relied on special features (as outlined above), and at the same time implement proper directory traversal. Jan Engelhardt -- Gesellschaft für Wissenschaftliche Datenverarbeitung Am Fassberg, 37077 Göttingen, www.gwdg.de ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry. 2004-11-18 4:53 ` Theodore Ts'o 2004-11-18 11:22 ` Jan Engelhardt @ 2004-11-18 15:37 ` Stephen C. Tweedie 1 sibling, 0 replies; 7+ messages in thread From: Stephen C. Tweedie @ 2004-11-18 15:37 UTC (permalink / raw) To: Theodore Ts'o Cc: Andrew Morton, r6144, linux-kernel, ext2-devel@lists.sourceforge.net, phillips, Alex Tomas, Christopher Li, Christopher Li, Stephen Tweedie Hi, On Thu, 2004-11-18 at 04:53, Theodore Ts'o wrote: > > If we're going to do this, I think we need to stuff . and .. into the > > rbtree with the right hashes, but without ignoring other existing > > dirents with colliding hashes. > > We can't just do that, because there are programs that's assume '.' > and '..' are the first and second entries in the directory. Yes, they > are broken and non-portable, but so are programs that depend on > d_off.... Sorry, by "right hashes" I meant adding them with hashes 0 and 2 but in the correct place in the stream; so if we do have a hash collision on major-hash==0, we'll get ".." slightly out of order, but will still correctly provide all the entries. And your second patch seems to do exactly that. Thanks! > This patch should do this. Looks good to me --- have you tested it much? The only remaining thing I can think of is what happens if the while() loop at the end of ext3_htree_fill_tree() exits successfully after filling in hash-major==0. Then we'll restart at 2 next time, and will return ".." twice. I'm not sure that's actually possible, though. Moving the filling-in of ".." into the while loop would deal with this rare possibility. --Stephen ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2004-11-18 15:44 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20041116183813.11cbf280.akpm@osdl.org>
2004-11-17 22:34 ` Fw: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry Theodore Ts'o
2004-11-18 0:00 ` Stephen C. Tweedie
2004-11-18 4:53 ` Theodore Ts'o
2004-11-18 11:22 ` Jan Engelhardt
2004-11-18 14:06 ` Theodore Ts'o
2004-11-18 15:40 ` Jan Engelhardt
2004-11-18 15:37 ` Stephen C. Tweedie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox