git.vger.kernel.org archive mirror
From: Dmitry Ivankov <divanorama@gmail.com>
To: git@vger.kernel.org
Cc: Jonathan Nieder <jrnieder@gmail.com>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	David Barr <davidbarr@google.com>,
	Dmitry Ivankov <divanorama@gmail.com>
Subject: [PATCH/WIP 6/7] fast-import: workaround data corruption
Date: Thu, 28 Jul 2011 10:46:09 +0600
Message-ID: <1311828370-30477-7-git-send-email-divanorama@gmail.com>
In-Reply-To: <1311828370-30477-1-git-send-email-divanorama@gmail.com>

fast-import keeps track of a delta base for each tree object. When it is
time to compute the delta, the base object is reconstructed from the
in-memory tree representation using the children's delta-base sha1s. But
these can be unrelated due to several bugs, which leads to an object
with the wrong sha1 being delta-written to the packfile.

We have the base sha1 and what we think its data is. Verify the sha1,
and if it doesn't match, report it to stderr and don't use a delta for
this tree.

We could also die() here once the bugs are fixed. Or we could check
whether the data we've got is from our pack file and, if so, still try
to use it as a base.

Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
---
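
For readers without the fast-import sources at hand, the check described
above can be sketched in isolation. This is a minimal sketch under stated
assumptions, not the code from this patch: struct delta_base is a
hypothetical stand-in for the fields of struct last_object that the check
touches, verify_delta_base() is a made-up helper name, and toy_hash() is a
64-bit FNV-1a hash swapped in for the real SHA-1 over the git object header
and payload so that the example stays self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fields of struct last_object used here. */
struct delta_base {
	const unsigned char *data; /* candidate base contents, NULL if none */
	size_t len;                /* length of data in bytes */
	unsigned long offset;      /* pack offset of the base object */
	unsigned int depth;        /* delta chain depth of the base */
};

/*
 * Stand-in hash: 64-bit FNV-1a. The real code hashes "tree <len>\0"
 * plus the payload with SHA-1; this is NOT that, only a placeholder.
 */
static uint64_t toy_hash(const unsigned char *buf, size_t len)
{
	uint64_t h = 14695981039346656037ULL;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/*
 * Re-hash the reconstructed base and compare it with the hash we
 * expect the base to have. On mismatch, warn and clear the base so
 * that the caller stores the new object without a delta.
 * Returns 1 if the base was kept, 0 if there is no usable base.
 */
static int verify_delta_base(struct delta_base *lo, uint64_t expected)
{
	assert(lo != NULL);

	if (!lo->data)
		return 0; /* nothing to verify, no delta will be used */
	if (toy_hash(lo->data, lo->len) == expected)
		return 1;

	fprintf(stderr, "internal delta base mismatch,"
			" won't use delta for that tree\n");
	lo->data = NULL;
	lo->len = 0;
	lo->offset = 0;
	lo->depth = 0;
	return 0;
}
```

A caller would fill in the base, call verify_delta_base() with the hash
recorded for the old tree version, and proceed to write the object either
way. Clearing the base costs some pack size but never writes a delta
against data that does not hash to the recorded base.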
 fast-import.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 9f0d2fe..14a2a63 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1469,7 +1469,8 @@ static void drop_old(struct tree_entry *root)
 static void store_tree(struct tree_entry *root)
 {
 	struct tree_content *t = root->tree;
-	struct last_object lo = { STRBUF_INIT, 0, 0, /* no_swap */ 1 };
+	struct strbuf empty = STRBUF_INIT;
+	struct last_object lo = { empty, 0, 0, /* no_swap */ 1 };
 	struct object_entry *le;
 	unsigned int i;
 
@@ -1486,10 +1487,21 @@ static void store_tree(struct tree_entry *root)
 
 	le = find_object(root->versions[0].sha1);
 	if (S_ISDIR(root->versions[0].mode) && le && le->pack_id == pack_id) {
+		unsigned char sh[20];
 		mktree(t, 0, &old_tree);
 		lo.data = old_tree;
 		lo.offset = le->idx.offset;
 		lo.depth = t->delta_depth;
+
+		prepare_object_hash(OBJ_TREE, &old_tree, NULL, NULL, sh);
+		if (hashcmp(sh, root->versions[0].sha1)) {
+			fprintf(stderr, "internal sha1 delta base mismatch,"
+					" won't use delta for that tree\n");
+			lo.data = empty;
+			lo.offset = 0;
+			lo.depth = 0;
+		}
+
 	}
 
 	mktree(t, 1, &new_tree);
-- 
1.7.3.4
