From: Jonathan Nieder <jrnieder@gmail.com>
To: "Dirk Süsserott" <newsletter@dirk.my1.cc>
Cc: Git Mailing List <git@vger.kernel.org>,
	Ramkumar Ramachandra <artagnon@gmail.com>
Subject: Re: Best way to check for a "dirty" working tree?
Date: Mon, 13 Jun 2011 17:22:48 -0500
Message-ID: <20110613222225.GA14446@elie>
In-Reply-To: <4DF381BF.3050301@dirk.my1.cc>

Hi Dirk,

Dirk Süsserott wrote:

> I have a script which moves data from somewhere to my local repo and
> then checks it in, like so:
>
> -----------
> mv /tmp/foo.bar .
> git commit -am "Updated foo.bar at $timestamp"
> -----------
>
> However, before overwriting "foo.bar" in my working directory, I'd like
> to check whether my working tree is dirty (at least "foo.bar").

Interesting example.  Sensible, as long as you limit the commit to
foo.bar (i.e., "git commit -m ... --only foo.bar")!

> I tried
>
> A) if ! git diff-index --quiet HEAD -- foo.bar; then
>        dirty=1
>    fi

To piggy-back on what Ram wrote, this is a question about the
difference between porcelain (high-level) and plumbing (low-level)
commands.

Generally speaking, plumbing is meant to give more stable behavior for
scripts, in two ways:

 - On one hand we make a concerted effort to keep the command-line
   usage and output of plumbing stable.  By contrast, porcelain will
   change over time as we learn about the way people work.

 - On the other hand plumbing is designed to produce simple, reliable,
   and machine-friendly behavior.  For example, while "git checkout"
   will guess what the caller is trying to do based on whether its
   first argument is a branch name or a file, "git checkout-index"
   only accepts file names.  Plumbing tends to produce parseable
   output and not to automatically spawn a pager when its output is
   going to the terminal or to change behavior based on configuration.

Now, a word of warning.  One aspect of this "do not second-guess the
caller" behavior is that low-level commands like "git diff-index"
blindly trust the stat() information in the index, rather than
re-reading a seemingly modified file and updating the index when the
content turns out to be unchanged.  You can see this by running
"touch foo.bar": "git diff-index" will report the file as changed
until you use "git update-index" to refresh the stat information:

	git update-index --refresh --unmerged -q >/dev/null || :
	if ! git diff-index --quiet HEAD -- foo.bar; then
		dirty=1
	fi

Alas, this doesn't seem to be documented anywhere (except for the
gitcore-tutorial(7))!  It ought to be.
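
To make the stat() caveat concrete, here is a minimal sketch that
reproduces it in a throwaway repository.  The temporary directory, the
committer identity, and the fixed old timestamp passed to "touch -t"
are assumptions for illustration only (the old timestamp just avoids
same-second ties with the commit); none of them come from your script.

```shell
#!/bin/sh
# Sketch: a stat-only change makes diff-index report "dirty"
# until the index is refreshed.  Runs in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo content >foo.bar
git add foo.bar
# Identity is a placeholder so the commit works without global config.
git -c user.name=Example -c user.email=example@example.invalid \
	commit -q -m initial

# Change only the mtime, not the content.
touch -t 200001010000 foo.bar

# diff-index trusts the (now stale) stat data in the index.
if git diff-index --quiet HEAD -- foo.bar; then before=clean; else before=dirty; fi

# Refresh the stat data; "|| :" ignores the exit status.
git update-index --refresh --unmerged -q >/dev/null || :

if git diff-index --quiet HEAD -- foo.bar; then after=clean; else after=dirty; fi
echo "before refresh: $before, after refresh: $after"
```

The two checks run the same diff-index command; only the refresh in
between differs.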

> Both A) and B) work. But which one is better/faster/more reliable?

I suspect the fastest (by virtue of saving a fork + exec and not
having to stat files twice, once for update-index and again for
diff-index) is

	git -c diff.autorefreshindex=true diff --quiet -- foo.bar

by a sad accident of history --- the "opportunistic index refresh"
behavior it implements does not seem to be exposed as plumbing.
If you are going to be performing such operations in a loop, then

	git update-index --refresh --unmerged -q >/dev/null || :
	for i in loop
	do
		... actions like diff-index that trust the index ...
	done

will be faster.  And the latter is plumbing, with all the niceties
that entails, so if I were in your shoes I'd use the latter.
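
As a rough sketch of that refresh-once-then-loop pattern, assuming a
scratch repository with three made-up files (a.txt, b.txt, c.txt) and
a placeholder committer identity, none of which come from your setup:

```shell
#!/bin/sh
# Sketch: refresh the index once, then run several checks that
# trust the index.  File names here are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
for f in a.txt b.txt c.txt; do
	echo "original $f" >"$f"
done
git add a.txt b.txt c.txt
git -c user.name=Example -c user.email=example@example.invalid \
	commit -q -m initial

touch -t 200001010000 a.txt	# stat change only, content identical
echo "edited" >>b.txt		# real content change

# One refresh up front...
git update-index --refresh --unmerged -q >/dev/null || :

# ...then each diff-index call can rely on the index.
dirty_files=
for f in a.txt b.txt c.txt; do
	if ! git diff-index --quiet HEAD -- "$f"; then
		dirty_files="$dirty_files $f"
	fi
done
echo "dirty:$dirty_files"
```

Only the file whose content really changed should end up in the list;
the merely touched one is cleared by the single refresh.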

Hope that helps,
Jonathan

