public inbox for linux-ext4@vger.kernel.org
From: "Mike Snitzer" <snitzer@gmail.com>
To: "Theodore Tso" <tytso@mit.edu>
Cc: "Thomas Trauner" <tt@it-austria.net>,
	linux-ext4@vger.kernel.org, ext3-users <ext3-users@redhat.com>
Subject: Re: duplicate entries on ext3 when using readdir/readdir64
Date: Wed, 13 Aug 2008 17:21:20 -0400
Message-ID: <170fa0d20808131421j3e4955dcra611509f1a094547@mail.gmail.com>
In-Reply-To: <20080806140722.GA14109@mit.edu>

On Wed, Aug 6, 2008 at 10:07 AM, Theodore Tso <tytso@mit.edu> wrote:
> On Tue, Aug 05, 2008 at 12:53:51PM +0200, Thomas Trauner wrote:
>> On Fri, 2008-08-01 at 08:16 -0400, Theodore Tso wrote:
>> > On Fri, Aug 01, 2008 at 11:43:40AM +0200, Thomas Trauner wrote:
>> > >
>> > > I have a problem with directories that contain more than 10000 entries
>> > > (Ubuntu 8.04.1) or with more than 70000 entries (RHEL 5.2). If you use
>> > > readdir(3) or readdir64(3) you get one entry twice, with same name and
>> > > inode.
>> > >
>> I ran new tests with the code at
>> <http://www.unet.univie.ac.at/~a9100884/readdir.c> on a set of freshly
>> created, empty filesystems, each about 38GB in size, of type fat
>> (aborted after about 22000 entries because it took too long), ext2, xfs,
>> jfs and again ext3....
>
> OK, I have a workaround for you.  It appears there's a kernel bug
> hiding here, since there shouldn't be duplicates returned by readdir()
> even if we have hash collisions.

Ted,

The attached patch has served my employer (IBRIX) well for 2.5 years.
It was only recently, when I re-raised this issue internally based on
this thread, that a co-worker recalled the fix.
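
For anyone who wants to check a directory for the symptom without the
test program linked above, the check can be sketched roughly as follows.
This is not the readdir.c from that URL; count_duplicate_entries() is a
hypothetical helper written only for illustration. It reads every name
with readdir(3), sorts the names, and counts adjacent repeats.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int cmp_names(const void *a, const void *b)
{
	return strcmp(*(char *const *)a, *(char *const *)b);
}

/*
 * Hypothetical helper (not taken from readdir.c): read all entries of
 * `path` with readdir(3) and return how many times a name was returned
 * again after its first appearance, or -1 on error.
 */
int count_duplicate_entries(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *de;
	char **names = NULL;
	size_t n = 0, cap = 0, i;
	int dups = 0;

	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		size_t len = strlen(de->d_name) + 1;

		if (n == cap) {
			/* Grow the name table geometrically. */
			size_t ncap = cap ? cap * 2 : 1024;
			char **tmp = realloc(names, ncap * sizeof(*names));

			if (!tmp) {
				dups = -1;
				break;
			}
			names = tmp;
			cap = ncap;
		}
		names[n] = malloc(len);
		if (!names[n]) {
			dups = -1;
			break;
		}
		memcpy(names[n], de->d_name, len);
		n++;
	}
	closedir(dir);

	if (dups == 0 && n > 1) {
		/* After sorting, duplicates sit next to each other. */
		qsort(names, n, sizeof(*names), cmp_names);
		for (i = 1; i < n; i++)
			if (strcmp(names[i - 1], names[i]) == 0)
				dups++;
	}

	for (i = 0; i < n; i++)
		free(names[i]);
	free(names);
	return dups;
}
```

Calling count_duplicate_entries() on the large directory should return 0
on an unaffected filesystem; a positive return value would match the
behaviour reported at the start of this thread.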

regards,
Mike

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ext3_dx_readdir_hash_collision_fix.patch --]
[-- Type: text/x-patch; name=ext3_dx_readdir_hash_collision_fix.patch, Size: 1400 bytes --]

diff --git a/fs/ext3/dir.c b/fs/ext3/dir.c
index 2eea96e..42c5391 100644
--- a/fs/ext3/dir.c
+++ b/fs/ext3/dir.c
@@ -410,7 +410,7 @@ static int call_filldir(struct file * filp, void * dirent,
 				get_dtype(sb, fname->file_type));
 		if (error) {
 			filp->f_pos = curr_pos;
-			info->extra_fname = fname->next;
+			info->extra_fname = fname;
 			return error;
 		}
 		fname = fname->next;
@@ -449,11 +449,21 @@ static int ext3_dx_readdir(struct file * filp,
 	 * If there are any leftover names on the hash collision
 	 * chain, return them first.
 	 */
-	if (info->extra_fname &&
-	    call_filldir(filp, dirent, filldir, info->extra_fname))
-		goto finished;
-
-	if (!info->curr_node)
+	if (info->extra_fname) {
+                if (call_filldir(filp, dirent, filldir, info->extra_fname))
+                        goto finished;
+
+                info->extra_fname = NULL;
+                info->curr_node = rb_next(info->curr_node);
+                if (!info->curr_node) {
+                        if (info->next_hash == ~0) {
+                                filp->f_pos = EXT3_HTREE_EOF;
+                                goto finished;
+                        }
+                        info->curr_hash = info->next_hash;
+                        info->curr_minor_hash = 0;
+                }
+        } else if (!info->curr_node)
 		info->curr_node = rb_first(&info->root);
 
 	while (1) {
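
The control flow of the patched resume logic can be illustrated with a
small user-space model. This is only a sketch under loud assumptions: a
plain array stands in for the rb-tree, one array slot per hash value with
its collision chain hanging off it, a fixed-size output array stands in
for the filldir buffer, and the next_hash / EXT3_HTREE_EOF handling is
left out. struct dir_state, emit_chain() and model_readdir() are
hypothetical names, not kernel code.

```c
#include <stddef.h>

/* One directory entry; `next` links entries that share a hash. */
struct fname {
	const char *name;
	struct fname *next;
};

/* Model of the per-open-file readdir state (hypothetical). */
struct dir_state {
	struct fname **nodes;	/* stand-in for in-order rb-tree nodes */
	size_t nr_nodes;
	size_t curr;		/* stand-in for info->curr_node */
	struct fname *extra;	/* stand-in for info->extra_fname */
};

/* Model of the caller's buffer: holds at most `cap` names per call. */
struct out_buf {
	const char **names;
	size_t used, cap;
};

/* Emit a chain starting at `f`; remember where to resume if it is cut. */
static int emit_chain(struct dir_state *st, struct out_buf *b,
		      struct fname *f)
{
	while (f) {
		if (b->used == b->cap) {
			/* As in the patch: retry the entry that did not fit. */
			st->extra = f;
			return -1;
		}
		b->names[b->used++] = f->name;
		f = f->next;
	}
	return 0;
}

/* One modelled readdir call; returns the number of names produced. */
size_t model_readdir(struct dir_state *st, const char **out, size_t cap)
{
	struct out_buf b = { out, 0, cap };

	if (st->extra) {
		if (emit_chain(st, &b, st->extra))
			return b.used;
		st->extra = NULL;
		/*
		 * Mirrors rb_next(info->curr_node): in this model, skipping
		 * this step makes the loop below restart the current chain
		 * from its head and emit its first entries a second time.
		 */
		st->curr++;
	}

	while (st->curr < st->nr_nodes) {
		if (emit_chain(st, &b, st->nodes[st->curr]))
			return b.used;
		st->curr++;
	}
	return b.used;
}
```

With three slots "a", "b1 -> b2 -> b3" and "c" and room for two names per
call, successive calls in this model produce "a b1", then "b2 b3", then
"c", so each name appears exactly once even though the collision chain is
split across calls.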
