git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* btrfs and git-reflog
@ 2008-01-25  8:15 Paul Collins
  2008-01-25  9:50 ` Paul Collins
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Paul Collins @ 2008-01-25  8:15 UTC (permalink / raw)
  To: btrfs-devel; +Cc: git

I was just playing with git 1.5.3.8 and btrfs 0.11, and I noticed
something odd.

If I prepare a very simple repository:

  $ mkdir foo
  $ cd foo
  $ git init
  Initialized empty Git repository in .git/
  $ echo hi > blort
  $ git add .
  $ git commit -m create
  Created initial commit 4ae9415: create
   1 files changed, 1 insertions(+), 0 deletions(-)
   create mode 100644 blort

and then attempt to expire the reflogs

  $ git-reflog --expire --all

on ext3, git-reflog completes its work and exits immediately;

and on btrfs, it gets stuck in some sort of loop that causes it to
allocate more and more memory until I kill it or it pushes the
machine into OOM.

Kernel is 2.6.24 or so on x86-64.

-- 
Paul Collins
Wellington, New Zealand

Dag vijandelijk luchtschip de huismeester is dood

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs and git-reflog
  2008-01-25  8:15 btrfs and git-reflog Paul Collins
@ 2008-01-25  9:50 ` Paul Collins
  2008-01-25 15:01 ` [Btrfs-devel] " Chris Mason
  2008-01-25 15:50 ` Chris Mason
  2 siblings, 0 replies; 10+ messages in thread
From: Paul Collins @ 2008-01-25  9:50 UTC (permalink / raw)
  To: btrfs-devel; +Cc: git

Correction.

Paul Collins <paul@burly.ondioline.org> writes:

> and then attempt to expire the reflogs
>
>   $ git-reflog --expire --all

  $ git-reflog expire --all

-- 
Paul Collins
Wellington, New Zealand

Dag vijandelijk luchtschip de huismeester is dood

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Btrfs-devel] btrfs and git-reflog
  2008-01-25  8:15 btrfs and git-reflog Paul Collins
  2008-01-25  9:50 ` Paul Collins
@ 2008-01-25 15:01 ` Chris Mason
  2008-01-25 15:50 ` Chris Mason
  2 siblings, 0 replies; 10+ messages in thread
From: Chris Mason @ 2008-01-25 15:01 UTC (permalink / raw)
  To: btrfs-devel; +Cc: Paul Collins, git

On Friday 25 January 2008, Paul Collins wrote:
> I was just playing with git 1.5.3.8 and btrfs 0.11, and I noticed
> something odd.
>
> If I prepare a very simple repository:
>
>   $ mkdir foo
>   $ cd foo
>   $ git init
>   Initialized empty Git repository in .git/
>   $ echo hi > blort
>   $ git add .
>   $ git commit -m create
>   Created initial commit 4ae9415: create
>    1 files changed, 1 insertions(+), 0 deletions(-)
>    create mode 100644 blort
>
> and then attempt to expire the reflogs
>
>   $ git-reflog --expire --all
>
> on ext3, git-reflog completes its work and exits immediately;

Strange, but I can reproduce here.  I'll take a look, thanks for this report.

-chris

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Btrfs-devel] btrfs and git-reflog
  2008-01-25  8:15 btrfs and git-reflog Paul Collins
  2008-01-25  9:50 ` Paul Collins
  2008-01-25 15:01 ` [Btrfs-devel] " Chris Mason
@ 2008-01-25 15:50 ` Chris Mason
  2008-01-25 17:09   ` Linus Torvalds
  2 siblings, 1 reply; 10+ messages in thread
From: Chris Mason @ 2008-01-25 15:50 UTC (permalink / raw)
  To: btrfs-devel; +Cc: Paul Collins, git

On Friday 25 January 2008, Paul Collins wrote:
> I was just playing with git 1.5.3.8 and btrfs 0.11, and I noticed
> something odd.
>
> If I prepare a very simple repository:
>
>   $ mkdir foo
>   $ cd foo
>   $ git init
>   Initialized empty Git repository in .git/
>   $ echo hi > blort
>   $ git add .
>   $ git commit -m create
>   Created initial commit 4ae9415: create
>    1 files changed, 1 insertions(+), 0 deletions(-)
>    create mode 100644 blort
>
> and then attempt to expire the reflogs
>
>   $ git-reflog expire --all
>
> on ext3, git-reflog completes its work and exits immediately;
>
> and on btrfs, it gets stuck in some sort of loop that causes it to
> allocate more and more memory until I kill it or it pushes the
> machine into OOM.
>
It works something like this:

readdir(.git/logs/refs/heads)
# this returns .git/logs/refs/heads/master
# <do some work>

open(.git/logs/refs/heads/master.lock, O_CREAT);
# <do more work>, write to master.lock

rename(master.lock, master)

readdir(.git/logs/refs/heads)

readdir again returns .git/logs/refs/heads/master, which is arguably
correct.  It is a new file that just happens to have a name
that git already saw.  So, git loops over this file infinitely because
it doesn't realize it has already processed it.

This happens because btrfs doesn't return the hash of the
file name as the offset to readdir.  It returns the inode number,
and since master is a new file, btrfs considers it a non-duplicate
entry.

The btrfs patch below changes my readdir code to force the
directory f_pos field to the max offset allowed when we've
seen all the directory entries.  This prevents the readdir
call from looping forever in the face of newly added files.

But, git might want to add some checks to see if it has
already processed things.

diff -r 21e9b461f802 inode.c
--- a/inode.c	Thu Jan 24 16:13:14 2008 -0500
+++ b/inode.c	Fri Jan 25 10:28:49 2008 -0500
@@ -1430,7 +1431,7 @@ read_dir_items:
 			di = (struct btrfs_dir_item *)((char *)di + di_len);
 		}
 	}
-	filp->f_pos++;
+	filp->f_pos = INT_LIMIT(typeof(filp->f_pos));
 nopos:
 	ret = 0;
 err:

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Btrfs-devel] btrfs and git-reflog
  2008-01-25 15:50 ` Chris Mason
@ 2008-01-25 17:09   ` Linus Torvalds
  2008-01-25 20:05     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2008-01-25 17:09 UTC (permalink / raw)
  To: Chris Mason; +Cc: btrfs-devel, Paul Collins, git



On Fri, 25 Jan 2008, Chris Mason wrote:
> 
> The btrfs patch below changes my readdir code to force the
> directory f_pos field to the max offset allowed when we've
> seen all the directory entries.  This prevents the readdir
> call from looping forever in the face of newly added files.

I think such a change may be sensible if we also taught the Linux VFS 
layer about it (so that we could avoid the costly parts of readdir() 
getting a semaphore etc the next call), but in this particular format it's 
just an ugly hack for what is a git bug.

> But, git might want to add some checks to see if it has
> already processed things.

Yes indeed. This is clearly git mis-using "readdir()", and should be 
fixed. It could happen on other filesystems, and very much including ones 
where there is no option of just "fixing" the filesystem (including 
Linux: not everybody can upgrade their kernels just because git makes 
some broken assumptions).

		Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Btrfs-devel] btrfs and git-reflog
  2008-01-25 17:09   ` Linus Torvalds
@ 2008-01-25 20:05     ` Junio C Hamano
  2008-01-26  7:52       ` Junio C Hamano
  2008-01-26  7:53       ` reflog-expire: Avoid creating new files in a directory inside readdir(3) loop Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2008-01-25 20:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Chris Mason, btrfs-devel, Paul Collins, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Yes indeed. This is clearly git mis-using "readdir()", and should be 
> fixed. It could happen on other filesystems, and very much including ones 
> where there is no option of just "fixing" the filesystem (including 
> Linux: not everybody can upgrade their kernels just because git makes 
> some broken assumptions).

I agree that this is a broken assumption on git's part.  When we
have this loop:

	while (ent = readdir()) {
        	... do something in that directory ...
	}

we should expect readdir() may return new entries and cope with
it, as "If a file is removed from or added to the directory
after the most recent call to opendir() or rewinddir(), whether
a subsequent call to readdir() returns an entry for that file is
unspecified." [*1*]

The modification may not be done by us, but by "git push" from
elsewhere to update an existing ref, which can run while we are
inside this loop.

For this particular case, I think keeping track of what we have
already dealt with and skipping when we see duplicates is a good
enough solution.  Removal of a loose ref is protected by taking
a lock on the packed-refs file, so we shouldn't have to worry
about races between pack-refs and a push to delete a ref (IOW,
it won't be an issue that readdir(3) may report a file that is
removd by simultaneous git-push).

However, I am not sure if the way delete_ref() takes locks (one
for the ref itself and another for packed-refs file) is
deadlock-free.  That may need an independent fix if there is; I
haven't closely looked at this.

I suspect we might want to change the current ".lock" suffix to
something that would make the lock filenames an invalid ref
(e.g., "..lock", or "~"), so that people can use "foo.lock" as a
branch name.  That is also outside of the scope of readdir(3)
abuse fix, though.


[Reference]

*1* http://www.opengroup.org/onlinepubs/000095399/functions/readdir.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Btrfs-devel] btrfs and git-reflog
  2008-01-25 20:05     ` Junio C Hamano
@ 2008-01-26  7:52       ` Junio C Hamano
  2008-01-27  7:22         ` Linus Torvalds
  2008-01-26  7:53       ` reflog-expire: Avoid creating new files in a directory inside readdir(3) loop Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2008-01-26  7:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Chris Mason, btrfs-devel, Paul Collins, git

Junio C Hamano <gitster@pobox.com> writes:

> I agree that this is a broken assumption on git's part.  When we
> have this loop:
>
> 	while (ent = readdir()) {
>         	... do something in that directory ...
> 	}
>
> we should expect readdir() may return new entries and cope with
> it, as "If a file is removed from or added to the directory
> after the most recent call to opendir() or rewinddir(), whether
> a subsequent call to readdir() returns an entry for that file is
> unspecified." [*1*]

Here is a result of a mini-audit.

* builtin-prune-packed.c::prune_dir() loops and unlinks (some
  of) returned paths while in the loop.  This should not
  interfere with readdir(3).  I am presuming that we can declare
  readdir(3) implementation buggy if this happens:

	* opendir();
	* readdir() gives $P;
        * unlink($P);
        * readdir() later gives $P again.

  Otherwise we need to lose check for return value from
  unlink().

* builtin-prune.c::prune_dir() has a similar construct and the
  same (non-)issue.

* dir.c::remove_dir_recursively() -- likewise.

* entry.c::remove_subtree() -- likewise.  We might want to unify
  this with the previous one.

A patch to "reflog-expire --all" will follow in a separate
message.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* reflog-expire: Avoid creating new files in a directory inside readdir(3) loop
  2008-01-25 20:05     ` Junio C Hamano
  2008-01-26  7:52       ` Junio C Hamano
@ 2008-01-26  7:53       ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2008-01-26  7:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Chris Mason, btrfs-devel, Paul Collins, git

"git reflog expire --all" opened a directory in $GIT_DIR/logs/,
read reflog files in there readdir(3), and rewrote the file by
creating a new file and renaming it back inside the loop.  This
code structure can cause the newly created file to be returned
by subsequent call to readdir(3), and fall into an infinite loop
in the worst case.

This separates the processing to two phase.  Running
for_each_reflog() to find out and collect all refs, and then
iterate over them, calling expire_reflog().  This way, the
program would behave exactly the same way as if all the refs
were given by the user from the command line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 builtin-reflog.c |   40 ++++++++++++++++++++++++++++++++++++++--
 1 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/builtin-reflog.c b/builtin-reflog.c
index ce093ca..e6834dd 100644
--- a/builtin-reflog.c
+++ b/builtin-reflog.c
@@ -34,6 +34,16 @@ struct expire_reflog_cb {
 	struct cmd_reflog_expire_cb *cmd;
 };
 
+struct collected_reflog {
+	unsigned char sha1[20];
+	char reflog[FLEX_ARRAY];
+};
+struct collect_reflog_cb {
+	struct collected_reflog **e;
+	int alloc;
+	int nr;
+};
+
 #define INCOMPLETE	(1u<<10)
 #define STUDYING	(1u<<11)
 
@@ -281,6 +291,20 @@ static int expire_reflog(const char *ref, const unsigned char *sha1, int unused,
 	return status;
 }
 
+static int collect_reflog(const char *ref, const unsigned char *sha1, int unused, void *cb_data)
+{
+	struct collected_reflog *e;
+	struct collect_reflog_cb *cb = cb_data;
+	size_t namelen = strlen(ref);
+
+	e = xmalloc(sizeof(*e) + namelen + 1);
+	hashcpy(e->sha1, sha1);
+	memcpy(e->reflog, ref, namelen + 1);
+	ALLOC_GROW(cb->e, cb->nr + 1, cb->alloc);
+	cb->e[cb->nr++] = e;
+	return 0;
+}
+
 static int reflog_expire_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "gc.reflogexpire"))
@@ -349,8 +373,20 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 			putchar('\n');
 	}
 
-	if (do_all)
-		status |= for_each_reflog(expire_reflog, &cb);
+	if (do_all) {
+		struct collect_reflog_cb collected;
+		int i;
+
+		memset(&collected, 0, sizeof(collected));
+		for_each_reflog(collect_reflog, &collected);
+		for (i = 0; i < collected.nr; i++) {
+			struct collected_reflog *e = collected.e[i];
+			status |= expire_reflog(e->reflog, e->sha1, 0, &cb);
+			free(e);
+		}
+		free(collected.e);
+	}
+
 	while (i < argc) {
 		const char *ref = argv[i++];
 		unsigned char sha1[20];

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Btrfs-devel] btrfs and git-reflog
  2008-01-26  7:52       ` Junio C Hamano
@ 2008-01-27  7:22         ` Linus Torvalds
  2008-01-27  8:08           ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2008-01-27  7:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Chris Mason, btrfs-devel, Paul Collins, git



On Fri, 25 Jan 2008, Junio C Hamano wrote:
> 
> * builtin-prune-packed.c::prune_dir() loops and unlinks (some
>   of) returned paths while in the loop.  This should not
>   interfere with readdir(3).

Correct. The "loop and unlink" behaviour is common and accepted practice, 
and is, for example, what things like "rm -rf" tend to do. 

It's only newly created files that may or may not show up in readdir() 
after they've been created, regardless of whether they were created under 
a name that was previously existed.

>   I am presuming that we can declare
>   readdir(3) implementation buggy if this happens:
> 
> 	* opendir();
> 	* readdir() gives $P;
>         * unlink($P);
>         * readdir() later gives $P again.

Yes, but the potential problem is actually very different:

 - directory contains 'a', 'b' and 'c'

	* opendir()
	* readdir() returns 'a'
	  * unlink('a');
	* readdir() returns 'c', having skipped 'b'.

This is something that could in theory happen if a directory is indexed 
using the *position* of a filename in a directory. 'a' was position 1, 'b' 
was position 2, and 'c' was position 3. After the first readdir(), the 
file position was 2 (pointing at 'b'), but when we removed 'a', the other 
entries positions moved down, and now 'b' is at position 1, and 'c' is at 
position '2'. When we call readdir() the next time, it skips 'b' (because 
it already returned position 1!), and returns 'c'.

See? It's basically the issue of removing an entry in a linked list. And 
yes, we've actually had bugs like that in our in-memory tmpfs. But they 
were bugs, and they got fixed, and that's not how readdir() is supposed to 
work, exactly because it makes doing an "rm -rf" very hard.

So readdir() should basically be safe wrt 'unlink()' happening while the 
readdir() is running, and the only entry that can disappear is the one 
that is unlinekd.

So don't worry about unlink's, it's only new files being created in the 
same directory that can be problematic in that you don't know if 
readdir() will return them or not..

		Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Btrfs-devel] btrfs and git-reflog
  2008-01-27  7:22         ` Linus Torvalds
@ 2008-01-27  8:08           ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2008-01-27  8:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Chris Mason, btrfs-devel, Paul Collins, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Yes, but the potential problem is actually very different:
>
>  - directory contains 'a', 'b' and 'c'
>
> 	* opendir()
> 	* readdir() returns 'a'
> 	  * unlink('a');
> 	* readdir() returns 'c', having skipped 'b'.
>
> This is something that could in theory happen if a directory is indexed 
> using the *position* of a filename in a directory. 'a' was position 1, 'b' 
> was position 2, and 'c' was position 3. After the first readdir(), the 
> file position was 2 (pointing at 'b'), but when we removed 'a', the other 
> entries positions moved down, and now 'b' is at position 1, and 'c' is at 
> position '2'. When we call readdir() the next time, it skips 'b' (because 
> it already returned position 1!), and returns 'c'.
>
> See?

Yes, I had that one in mind too, but that will cause us to miss
'b' and its only effect is to leave 'b' unpacked.  If we somehow
dropped 'b' and lost information that is a different story, but
I do not think this failure mode would cause our current code to
do so.  So I do not think is a big deal even if some filesystem
had a bug like that.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-01-27  8:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-25  8:15 btrfs and git-reflog Paul Collins
2008-01-25  9:50 ` Paul Collins
2008-01-25 15:01 ` [Btrfs-devel] " Chris Mason
2008-01-25 15:50 ` Chris Mason
2008-01-25 17:09   ` Linus Torvalds
2008-01-25 20:05     ` Junio C Hamano
2008-01-26  7:52       ` Junio C Hamano
2008-01-27  7:22         ` Linus Torvalds
2008-01-27  8:08           ` Junio C Hamano
2008-01-26  7:53       ` reflog-expire: Avoid creating new files in a directory inside readdir(3) loop Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).