From: Chris Mason <chris.mason@oracle.com>
To: btrfs-devel@oss.oracle.com
Cc: Paul Collins <paul@burly.ondioline.org>, git@vger.kernel.org
Subject: Re: [Btrfs-devel] btrfs and git-reflog
Date: Fri, 25 Jan 2008 10:50:16 -0500
Message-ID: <200801251050.16697.chris.mason@oracle.com>
In-Reply-To: <873asmcodd.fsf@burly.wgtn.ondioline.org>
On Friday 25 January 2008, Paul Collins wrote:
> I was just playing with git 1.5.3.8 and btrfs 0.11, and I noticed
> something odd.
>
> If I prepare a very simple repository:
>
> $ mkdir foo
> $ cd foo
> $ git init
> Initialized empty Git repository in .git/
> $ echo hi > blort
> $ git add .
> $ git commit -m create
> Created initial commit 4ae9415: create
> 1 files changed, 1 insertions(+), 0 deletions(-)
> create mode 100644 blort
>
> and then attempt to expire the reflogs
>
> $ git-reflog expire --all
>
> on ext3, git-reflog completes its work and exits immediately;
>
> and on btrfs, it gets stuck in some sort of loop that causes it to
> allocate more and more memory until I kill it or it pushes the
> machine into OOM.
>

git-reflog works something like this:

    readdir(.git/logs/refs/heads)
    # this returns .git/logs/refs/heads/master
    # <do some work>
    open(.git/logs/refs/heads/master.lock, O_CREAT);
    # <do more work>, write to master.lock
    rename(master.lock, master)
    readdir(.git/logs/refs/heads)

The second readdir again returns .git/logs/refs/heads/master, which is
arguably correct: it is a new file that just happens to have a name
that git has already seen. So git loops over this file forever because
it doesn't realize it has already processed it.

This happens because btrfs doesn't return the hash of the
file name as the offset to readdir. It returns the inode number,
and since master is a new file with a new inode number, btrfs
considers it a non-duplicate entry.
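
As a rough illustration of why the choice of readdir offset matters, here
is a toy userspace model. It is not btrfs code: the directory is just an
array of offsets, the cursor returns the smallest offset at or after its
position, and the names toy_dir, toy_next and toy_expire_visits, the hash
value 1000 and the starting inode number 257 are all made up for the
example. The transient master.lock entry is not modelled; only the result
of the rename is.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

/* Toy directory: each live entry is represented only by its readdir offset. */
struct toy_dir {
	unsigned long off[TOY_MAX];
	size_t n;
};

static void toy_add(struct toy_dir *d, unsigned long off)
{
	assert(d->n < TOY_MAX);
	d->off[d->n++] = off;
}

static void toy_del(struct toy_dir *d, unsigned long off)
{
	for (size_t i = 0; i < d->n; i++) {
		if (d->off[i] == off) {
			d->off[i] = d->off[--d->n];
			return;
		}
	}
}

/* Find the entry with the smallest offset >= pos; return 0 at end of dir. */
static int toy_next(const struct toy_dir *d, unsigned long pos,
		    unsigned long *out)
{
	int found = 0;

	for (size_t i = 0; i < d->n; i++) {
		if (d->off[i] >= pos && (!found || d->off[i] < *out)) {
			*out = d->off[i];
			found = 1;
		}
	}
	return found;
}

/*
 * Simulate "readdir, rewrite via rename, readdir again" on a directory
 * holding one file. Returns how many times the file was visited, capped
 * at max_visits so the inode-number case terminates.
 */
int toy_expire_visits(int offset_is_inode, int max_visits)
{
	struct toy_dir d = { .n = 0 };
	unsigned long name_hash = 1000;	/* hypothetical hash of "master" */
	unsigned long next_ino = 257;	/* hypothetical next free inode number */
	unsigned long pos = 0, off;
	int visits = 0;

	toy_add(&d, offset_is_inode ? next_ino++ : name_hash);
	while (visits < max_visits && toy_next(&d, pos, &off)) {
		visits++;
		pos = off + 1;
		/* master.lock is renamed over master: the old entry goes away ... */
		toy_del(&d, off);
		/* ... and the replacement gets its offset from the chosen scheme. */
		toy_add(&d, offset_is_inode ? next_ino++ : name_hash);
	}
	return visits;
}
```

In this model, toy_expire_visits(0, 100) visits the file once, because the
replacement lands at an offset the cursor has already passed, while
toy_expire_visits(1, 100) keeps finding a "new" entry and only stops at
the cap of 100.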

The btrfs patch below changes my readdir code to force the
directory f_pos field to the max offset allowed once we've
seen all the directory entries. This prevents the readdir
call from looping forever in the face of newly added files.

Even so, git might want to add some checks to see whether it has
already processed an entry.

diff -r 21e9b461f802 inode.c
--- a/inode.c	Thu Jan 24 16:13:14 2008 -0500
+++ b/inode.c	Fri Jan 25 10:28:49 2008 -0500
@@ -1430,7 +1431,7 @@ read_dir_items:
 			di = (struct btrfs_dir_item *)((char *)di + di_len);
 		}
 	}
-	filp->f_pos++;
+	filp->f_pos = INT_LIMIT(typeof(filp->f_pos));
 nopos:
 	ret = 0;
 err:
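
On the git side, one possible shape for such a check is to avoid creating
files while a readdir loop is still open: read all the names first, close
the directory, and only then do the lock-and-rename work. The following is
a minimal userspace sketch of that idea, not git's actual code. The
function names rewrite_one and rewrite_all are hypothetical, and it
assumes a flat directory of regular files.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: rewrite one file via <name>.lock plus rename(). */
static int rewrite_one(const char *dir, const char *name)
{
	char path[4096], lock[4096 + 8];
	FILE *f;

	if (snprintf(path, sizeof(path), "%s/%s", dir, name) >= (int)sizeof(path))
		return -1;
	snprintf(lock, sizeof(lock), "%s.lock", path);

	f = fopen(lock, "w");
	if (!f)
		return -1;
	fputs("rewritten\n", f);
	if (fclose(f) != 0)
		return -1;
	return rename(lock, path);
}

/* Returns the number of files rewritten, or -1 if dir cannot be opened. */
int rewrite_all(const char *dir)
{
	char **names = NULL;
	size_t n = 0, cap = 0, i;
	struct dirent *de;
	int done = 0;
	DIR *d = opendir(dir);

	if (!d)
		return -1;

	/* Pass 1: only read names; nothing is created or renamed here. */
	while ((de = readdir(d)) != NULL) {
		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		if (n == cap) {
			size_t ncap = cap ? cap * 2 : 16;
			char **tmp = realloc(names, ncap * sizeof(*names));

			if (!tmp)
				break;
			names = tmp;
			cap = ncap;
		}
		names[n] = strdup(de->d_name);
		if (!names[n])
			break;
		n++;
	}
	closedir(d);

	/* Pass 2: the directory stream is closed, so new files cannot re-enter it. */
	for (i = 0; i < n; i++) {
		if (rewrite_one(dir, names[i]) == 0)
			done++;
		free(names[i]);
	}
	free(names);
	return done;
}
```

Because the list of names is fixed before any file is replaced, each entry
is processed exactly once per call, whatever offset scheme the filesystem
uses for readdir.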