From: Jeff King <peff@peff.net>
To: Andreas Mohr <andi@lisas.de>
Cc: Thomas Braun <thomas.braun@virtuell-zuhause.de>,
git@vger.kernel.org, msysGit <msysgit@googlegroups.com>
Subject: Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Date: Thu, 16 Apr 2015 11:28:50 -0400
Message-ID: <20150416152849.GA30137@peff.net>
In-Reply-To: <20150416113505.GA30818@rhlx01.hs-esslingen.de>
On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote:
> I strongly suspect that git's repacking implementation
> (probably unrelated to msysgit-specific deviations,
> IOW, git *core* handling)
> simply is buggy
> in that it may keep certain file descriptors open
> at least a certain time (depending on scope of implementation/objects!?)
> beyond having finished its operation (rename?).

Hrm. I do not see anything in builtin/fetch.c that closes the packfile
descriptors before running "gc --auto". So basically the sequence:

  1. Fetch performs the actual fetch. It needs to open packfiles to do
     commit negotiation with the other side (the hard work is done
     by an index-pack subprocess, but it is likely we have to access
     _some_ objects).

  2. The packfiles remain open and mmap'd (at least on Linux) in the
     sha1_file.c:packed_git list.

  3. We spawn "gc --auto" and wait for it to finish. While we are
     waiting, the descriptors are still open, so on Windows "gc --auto"
     will not be able to delete those packs.

But this seems too simple to be the problem, as it would mean that just
about any "gc --auto" that triggers a full repack would be a problem (so
anytime you have about 50 packs). But maybe the gc "autodetach" behavior
means it works racily.
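
To check whether a given repository is anywhere near that threshold, something like the following could be used. This is only a rough sketch: count_packs and too_many_packs are hypothetical helper names of mine, not git commands, and the exact comparison git makes against gc.autoPackLimit is an assumption here. It does skip packs with a .keep file, which "gc --auto" leaves alone.

```shell
#!/bin/sh
# Hypothetical helpers, not git's own logic: count packs in a directory
# and compare the count against a limit.

# Count pack-*.pack files directly inside directory $1, skipping packs
# that have a matching .keep file.
count_packs() {
    n=0
    for pack in "$1"/pack-*.pack; do
        [ -e "$pack" ] || continue              # glob matched nothing
        [ -e "${pack%.pack}.keep" ] && continue # kept packs do not count
        n=$((n + 1))
    done
    echo "$n"
}

# Succeed if directory $1 holds more packs than limit $2.
# (Assumption: a strict "more than" comparison.)
too_many_packs() {
    [ "$(count_packs "$1")" -gt "$2" ]
}
```

From the top of a non-bare repository this might be run as `too_many_packs .git/objects/pack "$(git config --get gc.autoPackLimit || echo 50)"`, where 50 is the documented default when the option is unset.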
I was able to set up the situation deterministically by running the
script below:
-- >8 --
#!/bin/sh
# XXX tweak this setting as appropriate
PATH_TO_GIT_BUILD=$HOME/compile/git
PATH=$PATH_TO_GIT_BUILD/bin-wrappers:$PATH
rm -rf parent child
# make a parent/child where the child will have to access
# a packfile to fulfill another fetch
git init parent &&
git -C parent commit --allow-empty -m base &&
git clone parent child &&
git -C parent commit --allow-empty -m extra &&
# we want to make our base pack really big, because otherwise
# git will open/mmap/close it. So we must exceed core.packedgitlimit
cd child &&
$PATH_TO_GIT_BUILD/test-genrandom foo 5000000 >file &&
git add file &&
git commit -m "large file" &&
git repack -ad &&
git config core.packedGitLimit 1M &&
# now make some spare packs to bust the gc.autopacklimit
for i in 1 2 3 4 5; do
	git commit --allow-empty -m $i &&
	git repack -d
done &&
git config gc.autoPackLimit 3 &&
git config gc.autoDetach false &&
GIT_TRACE=1 git fetch
-- >8 --
I also instrumented my (v1.9.5) git build like this:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 025bc3e..fc99e5e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1174,6 +1174,12 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	list.strdup_strings = 1;
 	string_list_clear(&list, 0);
 
+	{
+		struct packed_git *p;
+		for (p = packed_git; p; p = p->next)
+			trace_printf("pack %s has descriptor %d\n",
+				     p->pack_name, p->pack_fd);
+	}
 	run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
 
 	return result;
diff --git a/builtin/repack.c b/builtin/repack.c
index bb2314c..e8b29cf 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -105,6 +105,7 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name)
 	for (i = 0; i < ARRAY_SIZE(exts); i++) {
 		strbuf_setlen(&buf, plen);
 		strbuf_addstr(&buf, exts[i]);
+		trace_printf("unlinking %s\n", buf.buf);
 		unlink(buf.buf);
 	}
 	strbuf_release(&buf);
to confirm what was happening (because of course on Linux it is
perfectly fine to delete the open file). If this does trigger the bug
for you, though, it should be obvious even without the trace calls. :)
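
For anyone who wants to see the POSIX side of this in isolation, here is a small sketch that has nothing to do with git itself. It holds a descriptor open on a temporary file, unlinks the file, and then reads through the descriptor. Descriptor 3 and the variable names are arbitrary choices of mine. On Linux I would expect the rm to succeed and the read to still work, whereas the failures reported in this thread suggest the equivalent delete is refused on Windows.

```shell
#!/bin/sh
# Hypothetical demo (not part of git): keep a descriptor open on a file,
# unlink the file, and read through the descriptor afterwards.
tmp=$(mktemp) || exit 1
echo "pack data" >"$tmp"

exec 3<"$tmp"      # hold the file open, much as fetch holds pack_fd
rm "$tmp"          # remove the path while descriptor 3 is still open
rm_status=$?

data=$(cat <&3)    # read the contents back through descriptor 3
exec 3<&-          # close the descriptor

echo "unlink exit status: $rm_status"
echo "read after unlink: $data"
```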
-Peff