git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Sitsofe Wheeler <sitsofe@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Periodic hang during git index-pack
Date: Thu, 20 Dec 2018 10:10:38 -0500	[thread overview]
Message-ID: <20181220151037.GC27361@sigill.intra.peff.net> (raw)
In-Reply-To: <CALjAwxg2E_48kQYt1GHkcXvVmaFyPY3PGG9rHZNMp+++UqKfow@mail.gmail.com>

On Thu, Dec 20, 2018 at 10:03:45AM +0000, Sitsofe Wheeler wrote:

> > Check the backtrace of git-clone to see why it isn't feeding more data
> > (but note that it will generally have two threads -- one processing the
> > data from the remote, and one wait()ing for index-pack to finish).
> >
> > My guess, though, is that you'll find that git-clone is simply waiting
> > on another pipe: the one from ssh.
> 
> Ok here are backtraces from another run (with SSH multiplexing disabled):

OK, that's about what I expected. Here we have clone's sideband-demux
thread waiting to pull more packfile data from the remote:

> (gdb) thread apply all bt
> 
> Thread 2 (Thread 0x7faafbf1c700 (LWP 36586)):
> #0  0x00007faafc805384 in __libc_read (fd=fd@entry=5,
>     buf=buf@entry=0x7faafbf0ddec, nbytes=nbytes@entry=5)
>     at ../sysdeps/unix/sysv/linux/read.c:27
> #1  0x000055c8ca2f5b23 in read (__nbytes=5, __buf=0x7faafbf0ddec, __fd=5)
>     at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
> [...coming from packet_read / recv_sideband / sideband_demux...]

I assume fd=5 there is a pipe connected to ssh. You could double check
with "lsof" or similar, but I don't think it would ever be reading from
anywhere else.

And then this thread:

> Thread 1 (Thread 0x7faafce3bb80 (LWP 36584)):
> #0  0x00007faafc805384 in __libc_read (fd=fd@entry=7,
>     buf=buf@entry=0x7ffda489ef10, nbytes=nbytes@entry=46)
>     at ../sysdeps/unix/sysv/linux/read.c:27
> #1  0x000055c8ca2f5b23 in read (__nbytes=46, __buf=0x7ffda489ef10, __fd=7)
>     at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
> #2  xread (fd=fd@entry=7, buf=buf@entry=0x7ffda489ef10, len=len@entry=46)
>     at wrapper.c:260
> #3  0x000055c8ca2f5cdb in read_in_full (fd=7, buf=buf@entry=0x7ffda489ef10,
>     count=count@entry=46) at wrapper.c:318
> #4  0x000055c8ca26a54f in index_pack_lockfile (ip_out=<optimised out>)
>     at pack-write.c:297

...is us waiting on index-pack to single completion by printing out the
name of the pack's .keep file.

And then this is just index-pack:

> (gdb) thread apply all bt
> 
> Thread 1 (Thread 0x7ff8a03dab80 (LWP 36587)):
> #0  0x00007ff89fda434e in __libc_read (fd=fd@entry=0,
>     buf=buf@entry=0x5604bea43d40 <input_buffer>, nbytes=nbytes@entry=4096)
>     at ../sysdeps/unix/sysv/linux/read.c:27
> #1  0x00005604be77bb23 in read (__nbytes=4096,
>     __buf=0x5604bea43d40 <input_buffer>, __fd=0)
>     at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
> #2  xread (fd=0, buf=0x5604bea43d40 <input_buffer>, len=4096) at wrapper.c:260
> #3  0x00005604be5fb069 in fill (min=min@entry=1) at builtin/index-pack.c:255

...waiting for more data.

So though it goes back and forth between two processes, the sequence of
blocking data is linear due to the threads:

  ssh -> git-clone        -> index-pack -> git-clone
         (sideband_demux)                  (index_pack_lockfile)

with each blocking on read() from its predecessor. So you need to find
out why "ssh" is blocking. Unfortunately, short of a bug in ssh, the
likely cause is either:

  1. The git-upload-pack on the remote side stopped generating data for
     some reason. You may or may not have access on the remotehost to
     dig into that.

     It's certainly possible there's a deadlock bug between the server
     and client side of a Git conversation. But I'd find it extremely
     unlikely to find such a deadlock bug at this point in the
     conversation, because at this point the client side has nothing
     left to say to the server. The server should just be streaming out
     the packfile bytes and then closing the descriptor.

     You mentioned "Phabricator sshd scripts" running on the server.
     I don't know what Phabricator might be sticking in the middle of
     the connection, but that could be the source of the stall.

  2. It's possible the network connection dropped but ssh did not
     notice. Maybe try turning on ssh keepalives and seeing if it
     eventually times out?

-Peff

  reply	other threads:[~2018-12-20 15:11 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-19 22:59 Periodic hang during git index-pack Sitsofe Wheeler
2018-12-19 23:22 ` Jeff King
2018-12-20 10:03   ` Sitsofe Wheeler
2018-12-20 15:10     ` Jeff King [this message]
2018-12-20 16:48       ` Sitsofe Wheeler
2019-02-03  8:16         ` Sitsofe Wheeler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181220151037.GC27361@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=sitsofe@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).