git.vger.kernel.org archive mirror
* Best way to check for a "dirty" working tree?
@ 2011-06-11 14:54 Dirk Süsserott
  2011-06-12 12:23 ` Ramkumar Ramachandra
  2011-06-13 22:22 ` Jonathan Nieder
  0 siblings, 2 replies; 4+ messages in thread
From: Dirk Süsserott @ 2011-06-11 14:54 UTC
  To: Git Mailing List

Hi list,

I have a script which moves data from somewhere to my local repo and
then checks it in, like so:

-----------
mv /tmp/foo.bar .
git commit -am "Updated foo.bar at $timestamp"
-----------

However, before overwriting "foo.bar" in my working directory, I'd like
to check whether my working tree is dirty (at least "foo.bar").

I tried

A) if ! git diff-index --quiet HEAD -- foo.bar; then
       dirty=1
   fi

and

B) if ! git diff --quiet -- foo.bar; then
       dirty=1
   fi

Both A) and B) work. But which one is better/faster/more reliable? Or is
there a better solution? For my purpose, I cannot see a difference
between diff and diff-index, except the syntax.

Cheers,
    Dirk


* Re: Best way to check for a "dirty" working tree?
  2011-06-11 14:54 Best way to check for a "dirty" working tree? Dirk Süsserott
@ 2011-06-12 12:23 ` Ramkumar Ramachandra
  2011-06-13 22:22 ` Jonathan Nieder
  1 sibling, 0 replies; 4+ messages in thread
From: Ramkumar Ramachandra @ 2011-06-12 12:23 UTC
  To: Dirk Süsserott; +Cc: Git Mailing List

Hi Dirk,

Dirk Süsserott writes:
> A) if ! git diff-index --quiet HEAD -- foo.bar; then
>       dirty=1
>   fi
>
> and
>
> B) if ! git diff --quiet -- foo.bar; then
>       dirty=1
>   fi
>
> Both A) and B) work. But which one is better/faster/more reliable? Or is
> there a better solution? For my purpose, I cannot see a difference
> between diff and diff-index, except the syntax.

diff is a porcelain-ish command, while diff-index is closer to the
plumbing.  diff therefore carries extra argument-parsing and
pretty-printing code that your script doesn't use, so use diff-index.
Also, look at the scripts in git.git to see what they use; for
example, require_clean_work_tree in git-sh-setup.sh.
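
Apart from porcelain versus plumbing, A) and B) also compare different
things: "git diff" compares the working tree against the index, while
"git diff-index HEAD" compares it against the HEAD commit.  The
following is only a sketch in a throwaway repository (the file
contents and the identity settings are made up for illustration); it
stages a change and then runs both checks:

```shell
# Throwaway repository; names and contents are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo one >foo.bar
git add foo.bar
git commit -q -m "initial"

# Modify the file and stage the modification.
echo two >foo.bar
git add foo.bar

# B) index vs. working tree: the staged change is not visible here.
if git diff --quiet -- foo.bar; then diff_result=clean; else diff_result=dirty; fi

# A) HEAD vs. working tree: the staged change is reported.
if git diff-index --quiet HEAD -- foo.bar; then index_result=clean; else index_result=dirty; fi

echo "git diff:       $diff_result"
echo "git diff-index: $index_result"
```

So if the script should also notice changes that are already staged,
the diff-index form is the one that sees them.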

-- Ram


* Re: Best way to check for a "dirty" working tree?
  2011-06-11 14:54 Best way to check for a "dirty" working tree? Dirk Süsserott
  2011-06-12 12:23 ` Ramkumar Ramachandra
@ 2011-06-13 22:22 ` Jonathan Nieder
  2011-06-14 13:28   ` Dirk Süsserott
  1 sibling, 1 reply; 4+ messages in thread
From: Jonathan Nieder @ 2011-06-13 22:22 UTC
  To: Dirk Süsserott; +Cc: Git Mailing List, Ramkumar Ramachandra

Hi Dirk,

Dirk Süsserott wrote:

> I have a script which moves data from somewhere to my local repo and
> then checks it in, like so:
>
> -----------
> mv /tmp/foo.bar .
> git commit -am "Updated foo.bar at $timestamp"
> -----------
>
> However, before overwriting "foo.bar" in my working directory, I'd like
> to check whether my working tree is dirty (at least "foo.bar").

Interesting example.  Sensible, as long as you limit the commit to
foo.bar (i.e., "git commit -m ... --only foo.bar")!
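
As a rough illustration of why limiting the commit matters, the sketch
below assumes a throwaway repository in which an unrelated file
(other.txt, a made-up name) already has a staged change when the
script runs:

```shell
# Throwaway repository; other.txt and the contents are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo one >foo.bar
echo one >other.txt
git add foo.bar other.txt
git commit -q -m "initial"

# An unrelated change is already staged.
echo two >other.txt
git add other.txt

# The script's own update, committed with the path limit.
echo two >foo.bar
git commit -q -m "Updated foo.bar" --only foo.bar

# Which paths went into the new commit, and which are still staged?
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
still_staged=$(git diff --cached --name-only)
echo "committed:    $committed"
echo "still staged: $still_staged"
```

With "git commit -a" instead, the change to other.txt would have gone
into the same commit.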

> I tried
>
> A) if ! git diff-index --quiet HEAD -- foo.bar; then
>        dirty=1
>    fi

To piggy-back on what Ram wrote, this is a question about the
difference between porcelain (high-level) and plumbing (low-level)
commands.

Generally speaking, plumbing is meant to give more stable behavior for
scripts, in two ways:

 - On one hand we make a concerted effort to keep the command-line
   usage and output of plumbing stable.  By contrast, porcelain will
   change over time as we learn about the way people work.

 - On the other hand plumbing is designed to produce simple, reliable,
   and machine-friendly behavior.  For example, while "git checkout"
   will guess what the caller is trying to do based on whether its
   first argument is a branch name or a file, "git checkout-index"
   only accepts pathspecs.  Plumbing tends to produce parseable
   output and not to automatically spawn a pager when its output is
   going to the terminal or to change behavior based on configuration.

Now, a word of warning.  One aspect of this "do not second-guess the
caller" behavior is that low-level commands like "git diff-index"
blindly trust stat() information in the index, rather than going to
re-read a seemingly modified file and updating the index if the
content is not changed.  You can see this by running "touch foo.bar";
"git diff-index" will report the file as changed, until you use "git
update-index" to refresh the stat information:

	git update-index --refresh --unmerged -q >/dev/null || :
	if ! git diff-index --quiet HEAD -- foo.bar; then
		dirty=1
	fi

Alas, this doesn't seem to be documented anywhere (except for the
gitcore-tutorial(7))!  It ought to be.
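
The effect can be reproduced with a sketch like the following, again
in a throwaway repository with made-up contents.  It uses "touch -t"
with a fixed date in the past (an assumption chosen so that the
timestamp is certain to differ from the one recorded in the index):

```shell
# Throwaway repository; contents are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo data >foo.bar
git add foo.bar
git commit -q -m "initial"

# Change only the timestamp, not the content.
touch -t 200001010000 foo.bar

# diff-index trusts the stale stat data and reports a change.
if git diff-index --quiet HEAD -- foo.bar; then before=clean; else before=dirty; fi

# Refresh the stat information recorded in the index.
git update-index --refresh --unmerged -q >/dev/null || :

# Now the unchanged content is recognized as such.
if git diff-index --quiet HEAD -- foo.bar; then after=clean; else after=dirty; fi

echo "before refresh: $before"
echo "after refresh:  $after"
```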

> Both A) and B) work. But which one is better/faster/more reliable?

I suspect the fastest (by virtue of saving a fork + exec and not
having to stat files twice, once for update-index and again for
diff-index) is

	git -c diff.autorefreshindex=true diff --quiet -- foo.bar

by a sad accident of history --- the "opportunistic index refresh"
behavior it implements does not seem to be exposed as plumbing.
If you are going to be performing such operations in a loop, then

	git update-index --refresh --unmerged -q >/dev/null || :
	for i in loop
	do
		... actions like diff-index that trust the index ...
	done

will be faster.  And the latter is plumbing, with all the niceties
that entails, so if I were in your shoes I'd use the latter.
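
A concrete version of that loop might look like the sketch below.  The
file names a.txt, b.txt and c.txt and the dirty_files variable are
hypothetical, and the repository is a throwaway one created only for
the demonstration:

```shell
# Throwaway repository; file names and contents are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

for f in a.txt b.txt c.txt; do echo original >"$f"; done
git add a.txt b.txt c.txt
git commit -q -m "initial"

touch -t 200001010000 a.txt   # timestamp-only change
echo edited >b.txt            # real content change

# Refresh the index once, before the loop.
git update-index --refresh --unmerged -q >/dev/null || :

# Each check can now trust the index.
dirty_files=""
for f in a.txt b.txt c.txt
do
	if ! git diff-index --quiet HEAD -- "$f"; then
		dirty_files="$dirty_files $f"
	fi
done
echo "dirty:$dirty_files"
```

Only the file whose content really changed should end up in the list;
the touched file is cleared by the single refresh.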

Hope that helps,
Jonathan


* Re: Best way to check for a "dirty" working tree?
  2011-06-13 22:22 ` Jonathan Nieder
@ 2011-06-14 13:28   ` Dirk Süsserott
  0 siblings, 0 replies; 4+ messages in thread
From: Dirk Süsserott @ 2011-06-14 13:28 UTC
  To: Jonathan Nieder; +Cc: Git Mailing List, Ramkumar Ramachandra

Hi Jonathan,

On 14.06.2011 00:22, Jonathan Nieder wrote:
> Hi Dirk,
> 
> Dirk Süsserott wrote:
> 
>> I have a script which moves data from somewhere to my local repo and
>> then checks it in, like so:
>>
>> -----------
>> mv /tmp/foo.bar .
>> git commit -am "Updated foo.bar at $timestamp"
>> -----------
>>
>> However, before overwriting "foo.bar" in my working directory, I'd like
>> to check whether my working tree is dirty (at least "foo.bar").
> 
> Interesting example.  Sensible, as long as you limit the commit to
> foo.bar (i.e., "git commit -m ... --only foo.bar")!

Ah, nice hint. I didn't know that git-commit accepts a path, too.
That's safer. However, in my particular case the working tree is either
clean or exactly the file in question has changed. If something else
changes (e.g. my commit script), I do that in a separate "transaction".

> Now, a word of warning.  One aspect of this "do not second-guess the
> caller" behavior is that low-level commands like "git diff-index"
> blindly trust stat() information in the index, rather than going to
> re-read a seemingly modified file and updating the index if the
> content is not changed.  You can see this by running "touch foo.bar";
> "git diff-index" will report the file as changed, until you use "git
> update-index" to refresh the stat information:
> 
> 	git update-index --refresh --unmerged -q >/dev/null || :
> 	if ! git diff-index --quiet HEAD -- foo.bar; then
> 		dirty=1
> 	fi
> 
> Alas, this doesn't seem to be documented anywhere (except for the
> gitcore-tutorial(7))!  It ought to be.

Hmm, it MUST be documented somewhere, because I have several scripts
that use "update-index --refresh" to get rid of what I call "phantom
changes". Sometimes I transfer (scp) files from a remote machine to the
local tree. The files are already known to Git, so my first guess was
that Gitk would show only the "real" diff, but it actually showed *all*
transferred files as changed. After running "git status", Gitk gets it
right and shows only the files whose content differs. Surprisingly,
"git status" seems to be a read/write operation that does the
equivalent of "update-index --refresh" in the background. After some
research I learned about "update-index --refresh" and now use it
frequently for scp'ed files.

Unfortunately, I cannot remember *where* I learned about it.

> Hope that helps,
> Jonathan

That helped a lot. Thank you,
Dirk
