From: Anatoly Borodin <anatoly.borodin@gmail.com>
To: git@vger.kernel.org
Subject: A different bug in git-filter-branch (v2.7.0)
Date: Mon, 8 Feb 2016 23:55:37 +0000 (UTC)
Message-ID: <n9b9tp$gbr$1@ger.gmane.org>
In-Reply-To: 20160129231127.GA31798@sigill.intra.peff.net

Hi Jeff,


unfortunately, `--tree-filter true` doesn't solve the problem with the
repo at my work: not all old blobs are replaced with the new ones. I've
made a test repository to demonstrate it; it's a huge one (115M), but I
couldn't make it much smaller, because the bug fails to reproduce if the
repo is not big enough:

https://github.com/anatolyborodin/git-filter-branch-bug

There is a description with instructions in `README.md`. I hope you
will be able to reproduce it on your machine; if not, just add more
files :)

I've debugged the test case and found the place where `git diff-index`
behaves differently regarding `aa/bb.dat`:

read-cache.c +351	ie_match_stat():
...
	if (!changed && is_racy_timestamp(istate, ce)) {
		if (assume_racy_is_modified)
			changed |= DATA_CHANGED;
		else
			changed |= ce_modified_check_fs(ce, st);
	}
...

After `git-checkout-index` the blob hash for `aa/bb.dat` in the index is
88a0f09b9b2e4ccf2faec89ab37d416fba4ee79d (the huge binary), but the file
on disk is a text file "This file was to big, and it has been removed."
with the hash 16e0939430610600301680d5bf8a24a22ff8b6c4.

In the case of a "well-behaved" commit, the timestamp of the index and
the mtime of the cache entry are the same, so is_racy_timestamp()
returns 1 and ce_modified_check_fs() finds that the content of the file
has changed. `git diff-index` lists the file `aa/bb.dat`.

In the case of a "badly behaved" commit, the timestamp of the index and
the mtime of the cache entry are different (the index is 1 second
newer), so is_racy_timestamp() returns 0 and the file is assumed
unchanged; `git diff-index` prints nothing.

I don't know whether it should be considered a bug in the logic of
`git checkout-index`, or of `git diff-index` / `git update-index`.


-- 
Mit freundlichen Grüßen,
Anatoly Borodin

Thread overview: 6+ messages
2016-01-28 14:46 Bugs in git filter-branch (git replace related) Anatoly Borodin
2016-01-29  6:18 ` Jeff King
2016-01-29 18:24   ` Anatoly Borodin
2016-01-29 23:11     ` Jeff King
2016-02-08 23:55       ` Anatoly Borodin [this message]
2016-02-22 21:13         ` A different bug in git-filter-branch (v2.7.0) Jeff King
