git.vger.kernel.org archive mirror
From: Jonathan Nieder <jrnieder@gmail.com>
To: "Constantine A. Murenin" <mureninc@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Why git-whatchanged shows a commit touching every file, but git-log doesn't?
Date: Thu, 31 Jan 2013 11:34:34 -0800
Message-ID: <20130131193434.GG27340@google.com>
In-Reply-To: <CAPKkNb49FUgLxZxHmQJoqccQ1XVcFYbYF8kYDp0+Y27cmi56fg@mail.gmail.com>

Hi Constantine,

Constantine A. Murenin wrote:

> DragonFly BSD uses git as its SCM, with one single repository and
> branch for both the kernel and the whole userland.
>
> On 2011-11-26 (1322296064), someone did a commit that somehow touched
> every single file in the repository, even though most of the files
> were not modified one bit.

"gitk --simplify-by-decoration" might provide some insight.

In the dragonfly history, it seems that imports of a package typically
proceed in two steps:

 1. First, the upstream code is imported as a new "initial commit"
    with no history:

	cd ~/src
	git init gcc-4.7.2-import
	cd gcc-4.7.2-import
	tar -xf /path/to/gcc-4.7.2
	mkdir contrib
	mv gcc-4.7.2 contrib/gcc-4.7
	git add .
	git commit -m 'Import gcc-4.7.2 to new vendor branch'

 2. Next, that code is incorporated into dragonfly:

	cd ~/src/dragonfly
	git fetch ../gcc-4.7.2-import master:refs/heads/vendor/GCC47
	git merge vendor/GCC47
	rm -fr ../gcc-4.7.2-import
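
The two steps above can be tried end to end on a toy repository.  This
is only a sketch under stated assumptions: the names project,
toy-import and vendor/TOY are invented for the example, and the file
contents are placeholders.  Note that Git 2.9 and later refuse to merge
histories with no common ancestor unless --allow-unrelated-histories is
given, so the sketch passes that option.

```shell
#!/bin/sh
set -e
# Throwaway identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the existing project (hypothetical content).
git init -q project
(cd project && echo kernel >kernel.c && git add . &&
	git commit -q -m 'Existing project')

# Step 1: import the upstream code as a parentless commit.
git init -q toy-import
(cd toy-import && mkdir -p contrib/toy &&
	echo upstream >contrib/toy/toy.c && git add . &&
	git commit -q -m 'Import toy to new vendor branch')

# Step 2: fetch it into a vendor branch and merge.
cd project
git fetch -q ../toy-import HEAD:refs/heads/vendor/TOY
git merge -q --allow-unrelated-histories -m 'Merge vendor/TOY' vendor/TOY

# The project now has two root commits: its own and the import.
roots=$(git rev-list --count --max-parents=0 HEAD)
echo "root commits reachable from HEAD: $roots"
```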

Unfortunately, in the commit you mentioned, someone made a mistake.
Instead of importing a single new upstream package, the author
imported the entire dragonfly tree as a new vendor branch.  Oops.
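
One way to spot such a commit is to list every parentless commit and
count the files in its tree: a normal vendor import holds one package,
while the mistaken one would hold about as many files as the whole
project.  A minimal sketch, with list_roots being a name invented for
this example:

```shell
# list_roots [repo]: for each parentless commit reachable from any ref,
# print the number of files in its tree, its short hash and its subject.
list_roots() {
	repo=${1:-.}
	git -C "$repo" rev-list --max-parents=0 --all |
	while read -r root; do
		files=$(( $(git -C "$repo" ls-tree -r --name-only "$root" | wc -l) ))
		printf '%d files\t%s\n' "$files" \
			"$(git -C "$repo" log -1 --format='%h %s' "$root")"
	done
}
```

Piping the output through "sort -n" puts the largest import last.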

The effects might be counterintuitive:

 * tools like "git blame" and path-limited "git log" get a choice:
   when looking at the merge that pulled a copy of dragonfly into
   the existing dragonfly codebase, either parent is an equally
   sensible explanation, from blame's point of view, of the origin
   of this code.  I think both prefer the first parent here, so they
   happen to produce the "right" result.

 * tools like "git show" that describe what change a commit made
   get a choice: when looking at a parentless commit, the diff that
   brings a project into existence may or may not be interesting,
   depending on the situation.

   See
   http://thread.gmane.org/gmane.comp.version-control.git/182571/focus=182577
   for more about that.
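
The first effect can be reproduced on a toy repository.  The sketch
below is an illustration, not the actual dragonfly history: it builds
the bogus root commit and the merge with "git commit-tree" so that it
does not depend on the merge behavior of any particular Git version,
and the file name and commit messages are made up.  It then compares
path-limited history with and without --full-history; my understanding
is that "git whatchanged" does not simplify history either, which would
explain why it reports the extra commit.

```shell
#!/bin/sh
set -e
# Throwaway identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo hello >file.c
git add file.c
git commit -q -m 'Real history: add file.c'

# The mistake: the whole current tree becomes a new parentless commit...
tree=$(git rev-parse 'HEAD^{tree}')
bogus_root=$(git commit-tree -m 'Import entire tree as vendor branch' "$tree")
# ...which is then merged back in.  The merge changes no file.
merge=$(git commit-tree -p HEAD -p "$bogus_root" -m 'Merge vendor branch' "$tree")
git reset -q --hard "$merge"

echo 'path-limited log (history simplification on):'
git log --oneline -- file.c
echo 'path-limited log with --full-history:'
git log --oneline --full-history -- file.c

# Count the commits each mode reports for file.c.
simplified=$(git rev-list --count HEAD -- file.c)
full=$(git rev-list --count --full-history HEAD -- file.c)
echo "simplified: $simplified, full history: $full"
```

With simplification on, the walk follows only the first parent of the
merge, so just the real commit is listed.  With --full-history the
bogus root is listed as well, since it also appears to add file.c.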

But at its heart, this is just an instance of "lie when creating your
history and history-mining tools will lie back to you." :)

Hoping that clarifies a little,
Jonathan
