git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables
Date: Fri, 4 Sep 2015 08:54:49 -0400	[thread overview]
Message-ID: <20150904125448.GA25501@sigill.intra.peff.net> (raw)
In-Reply-To: <CAOxFTcx9E_FtYD_Jn3+S3j-rwGO+hJSVXUC2S5ZBB6der7dUuw@mail.gmail.com>

On Fri, Sep 04, 2015 at 12:52:45PM +0200, Giuseppe Bilotta wrote:

> Trying to push any changes with 2.5 resulted in this kind of failure:
> 
> user@clientmachine:~/some/git/workdir $ git push
> Counting objects: 6, done.
> Delta compression using up to 8 threads.
> Compressing objects: 100% (5/5), done.
> Writing objects: 100% (6/6), 841 bytes | 0 bytes/s, done.
> Total 6 (delta 2), reused 0 (delta 0)
> fatal: Could not switch to '/home/user/some/git': No such file or directory
> error: unpack failed: unpack-objects abnormal exit
> fatal: Could not switch to '/home/user/some/git': No such file or directory
> To git@remote.machine:remote-repo
> ! [remote rejected] master -> master (n/a (unpacker error))
> error: failed to push some refs to 'git@remote.machine:remote-repo'
>
> Notice two things: the messages refer to the worktree updir of the
> CLIENT machine, and even though it's _completely not obvious_ due to
> the missing 'remote:' lines, the messages actually come from the
> SERVER. The lack of indicator lines _alone_ took me hours of debugging
> before I finally understood that they were coming from the other side

Older versions of receive-pack would let unpack-objects output go
straight to stderr, but that changed in a22e6f8 (receive-pack: send
pack-processing stderr over sideband, 2012-09-21), which is in git
v1.7.12.3. What version of git is running on the remote server?

E.g., even without going over ssh, if I do:

  git init
  echo content >file && git add file && git commit -m foo
  git init --bare dst.git
  # force unpacker to fail
  chmod -w dst.git/objects
  git push dst.git

I get:

  Counting objects: 3, done.
  Writing objects: 100% (3/3), 205 bytes | 0 bytes/s, done.
  Total 3 (delta 0), reused 0 (delta 0)
  remote: error: insufficient permission for adding an object to repository database ./objects
  remote: fatal: failed to write object
  error: unpack failed: unpack-objects abnormal exit
  To dst.git
   ! [remote rejected] master -> master (unpacker error)
  error: failed to push some refs to 'dst.git'

The "unpack failed" line _does_ come from the remote, but comes straight
from receive-pack, not the child unpack-objects. Receive-pack
distinguishes between errors which should go to the client and which
just go to stderr for debugging (remember that not all transports
actually propagate stderr to the client; ssh is special here). It's
possible we could switch the "unpack failed" to go to the client, but it
is redundant with the "unpacker error" which _does_ go to the client.

> I still couldn't do much on the SERVER to debug due to a variety of
> reasons, but I finally had a suspicion: it was almost as if the SERVER
> was getting the GIT_DIR information from the CLIENT. And why the heck
> would _that_ be the case?
> 
> I then remembered that the server was actually configured to AcceptEnv
> GIT_* in sshd_config, for reasons related to git identity preservation
> despite single login account (please don't ask). Turning the AcceptEnv
> to a stricter GIT_AUTHOR* and GIT_COMMITTER* solved the issue.

I couldn't reproduce this problem, either during a local push, or across
an ssh session (where the client has "SendEnv GIT_*" and the server has
"AcceptEnv GIT_*").

In the local case, we explicitly unset GIT_DIR and related variables in
connect.c:git_connect. We don't seem to do so for the ssh case, though.
I can confirm that the variable makes it across to the remote:

  GIT_DIR=$PWD git push \
    --receive-pack='echo >&2 GIT_DIR=$GIT_DIR; git-receive-pack' \
    remote-host:remote-repo

shows our local $PWD on the remote side (though note that you have to
explicitly set $GIT_DIR; git-push does not do so normally).

On the receiving side, git-receive-pack takes an argument for the repo
path, and calls enter_repo. That should result in calling set_git_dir(),
which overwrites $GIT_DIR in the environment. AFAICT, it has always done
so. So I'm not sure how GIT_DIR would leak through, even on an older
version of git.

> 1. since when have git internals started exporting GIT_* variables
> related to the git dir and worktree location?

It has done so for a long time, though the exact rules for doing so
changes from time to time. Browsing "git log", I couldn't find any
recent changes in this area. It would help if you could bisect the
problem, as I can't manage to replicate it.

> 2. is it worth making sure that these don't get propagated via ssh?

It seems like a reasonable idea for git_connect() to do so in the ssh
case, as well as the local case. That _could_ be a regression for
somebody who uses an ssh-wrapper whose behavior changes based on some
$GIT_* variable, but that seems a bit far-fetched.

It shouldn't be necessary for $GIT_DIR, but it makes sense for other git
variables. E.g., with "AcceptEnv GIT_*", "git -c" config is propagated.
E.g.:

  # make a syntactically bogus commit
  commit=$(git cat-file commit HEAD |
           sed 's/>/>>/' |
	   git hash-object -t commit -w --stdin)
  git update-ref HEAD $commit
  # confirm that git complains
  git fsck

  # confirm that it gets blocked by receive.fsckObjects; the "-c" is
  # applied to the receive-pack directly, so we are already in the
  # "remote" repo.
  git push --receive-pack='git -c receive.fsckobjects=true receive-pack' dst.git

  # now let's try it with a local "-c". This doesn't get propagated, as
  # we clean out any environment-level config before running
  # receive-pack in the "remote". As a result, the push succeeds.
  git -c receive.fsckobjects=true push dst.git

  # and now the same thing over ssh; this _will_ complain, as we
  # propagated the config.
  git -c receive.fsckObjects=true push remote:dst.git

-Peff

  reply	other threads:[~2015-09-04 12:54 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-04 10:52 [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables Giuseppe Bilotta
2015-09-04 12:54 ` Jeff King [this message]
2015-09-04 14:02   ` Giuseppe Bilotta
2015-09-04 14:26     ` Jeff King
2015-09-04 18:18   ` Junio C Hamano
2015-09-04 21:10     ` Giuseppe Bilotta
2015-09-04 21:44     ` Jeff King
2015-09-04 22:40       ` [PATCH] git_connect: clear GIT_* environment for ssh Jeff King
2015-09-05 13:50         ` Giuseppe Bilotta
2015-09-08  8:33           ` [PATCH] git_connect: clarify conn->use_shell flag Jeff King
2015-09-08 17:18             ` Junio C Hamano
2015-09-08 21:40               ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150904125448.GA25501@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=giuseppe.bilotta@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).