All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Theodore Tso <tytso@mit.edu>,
	linux-kernel@vger.kernel.org, eugene@ibrix.com,
	msnitzer@ibrix.com, akpm@linux-foundation.org
Subject: Re: ext3: fix ext3_dx_readdir hash collision handling - Regression
Date: Fri, 24 Oct 2008 09:08:26 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LFD.2.00.0810240853310.3287@nehalem.linux-foundation.org> (raw)
In-Reply-To: <20081024042851.GA2360@gentoox2.trippelsdorf.de>



On Fri, 24 Oct 2008, Markus Trippelsdorf wrote:
>
> Notice that I don't use "rm -rf" but "rm -r". The problem only occurs
> when I run "rm -r" without the "-f" switch.
> 
> The first time I encountered this problem was on an ext4 test partition
> a few days ago. When I tried to delete a big directory structure on that
> partition with mc (GNU Midnight Commander) by hitting F8 and then selecting
> "All" on the "Delete it recursively" popup, it just stopped deleting
> files after a while. So I had to repeat this procedure several times to
> get rid of the directory.

Hmm. I may actually have hit the same problem. I attributed it to a git 
buglet, and left it for later, because the last few days were pretty crazy 
(closing the merge window is when people always send their last-minute 
stuff even if they shouldn't, but to make things worse I was also gone for 
a day-and-a-half).

The "git buglet" looks liek this:

 - build a kernel

 - do "git clean -dqfx". This fails with

	warning: failed to remove 'include/config/'

 - do "git clean -dqfx" again. And now it works - apparently because the 
   first invocation deleted enough of the big directory.

Doing an 'strace' to see what is going on, I see:

	..
	getdents(3, /* 132 entries */, 4096)    = 3888
	lstat("include/config/sgi", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
	open("include/config/sgi", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = 4
	fstat(4, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
	getdents(4, /* 3 entries */, 4096)      = 80
	lstat("include/config/sgi/partition.h", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
	unlink("include/config/sgi/partition.h") = 0
	getdents(4, /* 0 entries */, 4096)      = 0
	close(4)                                = 0
	rmdir("include/config/sgi")             = 0
	lstat("include/config/sgi", 0x7fffdc4d2110) = -1 ENOENT (No such file or directory)
	close(3)                                = 0
	write(2, "warning: failed to remove \'include/config/\'\n", 44) = 44
	..

and notice how it does that

	lstat("include/config/sgi", ..

_twice_. That, in turn (knowing the git implementation), implies that it 
was returned twice from that first "getdents(3, ...)" call.

So what git clean does is to scan over the readdir() return values, see if 
it's a file it knows about, try to remove it if not, lstat() every entry, 
recurse into them if they are directories, and then remove it. If the 
lstat() fails, "git clean" will fail.

So what happens is that it sees that "sgi" entry _twice_ in the readdir 
output, traverse into it once (and delete it), and the second time when it 
does an lstat() it will obviously fail (because now it's deleted!), and 
that will make "git clean" fail the whole thing due to an unexpected 
error.

And I _think_ that what brings this on is:

 - caring about the error value

   This is why "rm -rf" works for you, but "rm -r" does not. With "-f", rm 
   simply won't care. And like "rm -r", "git clean" cares about error 
   values.

 - large enough directories that you have *multiple* "getdents()" calls.

 - likely: removing files in between getdents() calls.

Hmm. Ted? I have not tried to revert that commit that Markus pinpointed 
(6a897cf447a83c9c3fd1b85a1e525c02d6eada7d: "ext3: fix ext3_dx_readdir hash 
collision handling"), but now that I look at that "git bug", I am getting 
pretty damn sure that it's exactly the same issue, and it's not a git bug 
at all.

Markus basically has exactly the same thing:

  rm: cannot remove `linux-2.6.27.2/arch/alpha/include/asm/statfs.h': No such file or directory

and that ENOENT is almost certainly because "rm" already removed that 
filename _once_, and it's the second time that fails.

And yes, that commit is all about returning directory entries twice. It 
claims to fix it, but I bet it breaks it.

		Linus

  parent reply	other threads:[~2008-10-24 16:09 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-22  9:32 ext3: fix ext3_dx_readdir hash collision handling - Regression Markus Trippelsdorf
2008-10-23  3:28 ` Theodore Tso
2008-10-23  6:37   ` Markus Trippelsdorf
2008-10-24  0:01     ` Theodore Tso
     [not found]       ` <20081024042851.GA2360@gentoox2.trippelsdorf.de>
2008-10-24 10:41         ` Theodore Tso
2008-10-24 16:08         ` Linus Torvalds [this message]
2008-10-24 20:44           ` Theodore Tso
2008-10-24 22:00           ` Re* " Junio C Hamano
2008-10-24 22:26             ` Linus Torvalds
2008-10-24 22:35               ` Junio C Hamano
2008-10-25 11:56           ` [PATCH] " Theodore Tso
2008-10-25 12:25             ` Markus Trippelsdorf
2008-10-25 21:15             ` Linus Torvalds
2008-10-26  2:39               ` [GIT PULL] " Theodore Tso
2008-10-26  2:42                 ` [PATCH 1/2] ext3: Fix duplicate entries returned from getdents() system call Theodore Ts'o
2008-10-26  2:42                   ` [PATCH 2/2] ext4: " Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.00.0810240853310.3287@nehalem.linux-foundation.org \
    --to=torvalds@linux-foundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=eugene@ibrix.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=markus@trippelsdorf.de \
    --cc=msnitzer@ibrix.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.