From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Johannes Sixt <j6t@kdbg.org>
Subject: [PATCH 0/4] fix hang in git push when pack-objects fails
Date: Thu, 31 Mar 2011 14:42:44 -0400 [thread overview]
Message-ID: <20110331184243.GA12027@sigill.intra.peff.net> (raw)
When we push over the git protocol, we spawn a pack-object process. If
pack-objects dies prematurely (for example, because there is some repo
corruption), we are careful to clean up the sideband demuxer (if it is
being used) with finish_async. However, for an async implementation
which forks (i.e., when we have no pthreads), that means we will
waitpid() for the async process.
Meanwhile, the async sideband demuxer will continue trying to stream
data from the remote repo until it gets EOF. Depending on what data
pack-objects actually sent, the remote repo may not actually send
anything (e.g., if we sent nothing and it is simply waiting for the
pack). This leads to deadlock cycle in which send-pack waits on the
demuxer, the demuxer waits on the remote receive-pack, and the remote
receive-pack waits on send-pack to send the pack data.
You can test this by compiling with NO_PTHREADS=1 and running the
following script:
-- >8 --
#!/bin/sh
rm -rf parent child
git init --bare parent &&
git init child &&
cd child &&
git remote add origin ../parent &&
echo content >file &&
git add file &&
git commit -m one &&
git push origin HEAD &&
echo content >>file &&
git add file &&
git commit -m two
sha1=`git rev-parse HEAD:file`
file=`echo $sha1 | sed 's,..,&/,'`
rm -fv .git/objects/$file
git push
-- 8< --
The problem bisects to 38a81b4 (receive-pack: Wrap status reports inside
side-band-64k, 2010-02-05). In fact, at that point in time we didn't use
pthreads at all for async calls on non-win32 platforms, so even people
with pthreads are affected. For example, you can trigger it with
v1.7.0.2 even with pthreads, since we didn't use them for async code
back then. That state continued until f6b6098 (Enable threaded async
procedures whenever pthreads is available, 2010-03-09), at which point
the problem went away for pthreads users.
So what I did was build a maint series straight on top of 38a81b4,
the source of the bug:
[1/4]: teach wait_or_whine a "quiet" mode
[2/4]: finish_async: be quiet when waiting for async process
[3/4]: run-command: allow aborting async code prematurely
[4/4]: send-pack: abort sideband demuxer on pack-objects error
The first two are refactoring so that aborting async code does not
produce a stray "child process died with signal 15" message. If they're
too invasive, they can go away and we can live with the extra message.
The third one introduces abort_async, which just kill()s the
forked process for the non-win32 case. For the win32 case, we need to
either:
1. do nothing. I'm not 100% sure why, but the bug does not manifest
itself with pthreads. I don't know how it behaves on win32.
2. do the equivalent of pthread_cancel. This makes more sense to me.
Even if this particular case doesn't have an issue in the threaded
case, having a primitive like abort_async actually kill the thread
is sensible.
So that fixes old versions. To fix newer ones, it needs to be merged
with f6b6098, which has a few conflicts. And then you can apply on top
of that merge:
[5/4]: run-command: implement abort_async for pthreads
which will again break win32, as the compat wrapper doesn't implement
pthread_cancel.
If it's easier to pull than to recreate the merge, you can find the
first series based on 38a81b4 here:
git://github.com/peff/git.git jk/maint-push-async-hang
and then the merge, with conflict resolution, and 5/4 on top here:
git://github.com/peff/git.git jk/maint-push-async-hang-threads
If it's easier, I can also build the fix on top of v1.7.2, which was the
first release with async pthreads, and then we can either call that "old
enough", or I can backport it via cherry-pick to the source.
-Peff
next reply other threads:[~2011-03-31 18:42 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-31 18:42 Jeff King [this message]
2011-03-31 18:43 ` [PATCH 1/4] teach wait_or_whine a "quiet" mode Jeff King
2011-03-31 20:56 ` Johannes Sixt
2011-04-01 1:35 ` Jeff King
2011-03-31 18:44 ` [PATCH 2/4] finish_async: be quiet when waiting for async process Jeff King
2011-03-31 18:44 ` [PATCH 3/4] run-command: allow aborting async code prematurely Jeff King
2011-04-01 9:36 ` Erik Faye-Lund
2011-04-01 13:59 ` Jeff King
2011-03-31 18:44 ` [PATCH 4/4] send-pack: abort sideband demuxer on pack-objects error Jeff King
2011-04-13 19:53 ` Johannes Sixt
2011-04-14 13:54 ` Jeff King
2011-04-14 19:36 ` Johannes Sixt
2011-04-14 20:21 ` Jeff King
2011-04-14 20:43 ` Johannes Sixt
2011-04-14 20:51 ` Jeff King
2011-04-14 21:05 ` Johannes Sixt
2011-04-14 21:21 ` Junio C Hamano
2011-04-24 20:42 ` [PATCH/RFC 1/2] send-pack --stateless-rpc: properly close the outgoing channel Johannes Sixt
2011-04-24 20:49 ` [PATCH 2/2] send-pack: avoid deadlock when pack-object dies early Johannes Sixt
2011-04-25 16:50 ` Jeff King
2011-04-25 17:41 ` Johannes Sixt
2011-04-25 17:51 ` Junio C Hamano
2011-04-25 21:04 ` [PATCH v2] " Johannes Sixt
2011-04-26 8:23 ` Jeff King
2011-04-25 16:40 ` [PATCH/RFC 1/2] send-pack --stateless-rpc: properly close the outgoing channel Jeff King
2011-03-31 18:45 ` [PATCH 5/4] run-command: implement abort_async for pthreads Jeff King
2011-04-01 9:41 ` Erik Faye-Lund
2011-04-01 10:15 ` Erik Faye-Lund
2011-04-01 17:27 ` Johannes Sixt
2011-04-01 17:38 ` Jeff King
2011-04-01 19:26 ` Erik Faye-Lund
2011-04-01 19:33 ` Erik Faye-Lund
2011-04-01 19:42 ` Johannes Sixt
2011-04-01 19:57 ` Erik Faye-Lund
2011-04-01 20:05 ` Jeff King
2011-04-01 20:13 ` Erik Faye-Lund
2011-04-01 20:17 ` Jeff King
2011-04-01 20:18 ` Jeff King
2011-04-01 20:34 ` Erik Faye-Lund
2011-04-01 20:36 ` Johannes Sixt
2011-04-01 20:41 ` Erik Faye-Lund
2011-04-01 20:18 ` Johannes Sixt
2011-04-01 20:31 ` Erik Faye-Lund
2011-04-01 21:16 ` Jeff King
2011-04-02 12:27 ` Erik Faye-Lund
2011-04-01 14:00 ` Jeff King
2011-03-31 20:45 ` [PATCH 0/4] fix hang in git push when pack-objects fails Johannes Sixt
2011-04-01 1:34 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110331184243.GA12027@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=j6t@kdbg.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).