From: Dmitry Ivankov <divanorama@gmail.com>
To: git@vger.kernel.org
Cc: Jonathan Nieder <jrnieder@gmail.com>,
"Shawn O. Pearce" <spearce@spearce.org>,
David Barr <davidbarr@google.com>,
Dmitry Ivankov <divanorama@gmail.com>
Subject: [PATCH/WIP 0/7] was: long fast-import errors out "failed to apply delta"
Date: Thu, 28 Jul 2011 10:46:03 +0600
Message-ID: <1311828370-30477-1-git-send-email-divanorama@gmail.com>
A very short summary: it was found [1] that fast-import can sometimes produce
broken packfiles (sha1 mismatch) or even wrong packfiles (data differs
from what was expected). This happens mostly on not-so-tiny svn-to-git or
even cvs-to-svn-to-git imports with lots of copying across the tree
(simulating tags/branches as directories in git, for example). But I won't
be surprised if this can happen without these operations too.
Technically, fast-import keeps an in-memory tree representation in which it
stores the sha1s of some previous tree states (to make deltas against them).
But when it comes to producing the delta, the old sha1's tree content is
rebuilt from the in-memory node and its children (not via sha1->object
lookup). The stored sha1 and the rebuilt content can turn out to be unrelated
to each other, as some operations change the children's states.
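To make the failure mode concrete, here is a toy sketch of it. Everything in it
is hypothetical: struct toy_tree, toy_hash (a 32-bit FNV-1a checksum standing in
for sha1) and the helper names are illustration only and are not the structures
or functions in fast-import.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Toy stand-in for sha1: 32-bit FNV-1a over a NUL-terminated string. */
static unsigned toy_hash(const char *s)
{
	unsigned h = 2166136261u;

	assert(s != NULL);
	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 16777619u;
	}
	return h;
}

/* Hypothetical, simplified tree node; not fast-import's real layout. */
struct toy_tree {
	unsigned base_hash;                      /* hash recorded for the old state */
	const char *old_child[TOY_MAX_CHILDREN]; /* "old version" of each child */
	size_t nr;
};

/* Rebuild the old tree content from the in-memory children. */
static void toy_serialize_old(const struct toy_tree *t, char *buf, size_t len)
{
	size_t i;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < t->nr; i++) {
		strncat(buf, t->old_child[i], len - strlen(buf) - 1);
		strncat(buf, "\n", len - strlen(buf) - 1);
	}
}

/* Record the hash of the current "old" content as the delta base. */
static void toy_record_base(struct toy_tree *t)
{
	char buf[256];

	toy_serialize_old(t, buf, sizeof(buf));
	t->base_hash = toy_hash(buf);
}

/* 1 if the rebuilt content still matches the recorded base hash, else 0. */
static int toy_base_is_consistent(const struct toy_tree *t)
{
	char buf[256];

	toy_serialize_old(t, buf, sizeof(buf));
	return toy_hash(buf) == t->base_hash;
}
```

If a caller records the base with toy_record_base() and some later operation
then rewrites one old_child[] slot without touching base_hash,
toy_base_is_consistent() returns 0: the delta would be computed against content
that the recorded hash does not describe.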
The most wanted bit for these patches is small testcases. Keeping all
the in-memory tree state and fast-import logic in mind is hard for me, so I
wasn't able to create small tests (the best is [2]: a 15M archive + custom git
builds + fixing the Makefile in [2] + a few minutes to reproduce).
Another good todo is to always avoid base sha1 mismatches (not just to avoid
corruption when one is detected). I think I can do this, but I won't be
confident in the code unless there is a bunch of good tests, and this series is
quite big already.
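The difference between "avoid corruption if detected" and "always avoid the
mismatch" can be pictured with a small guard. This is only a sketch under
assumptions: struct toy_base and toy_pick_delta_base are made-up names, and the
patches in this series may handle a detected mismatch differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a candidate delta base. */
struct toy_base {
	unsigned recorded_hash; /* hash the tree node claims as its base */
	unsigned content_hash;  /* hash of the content rebuilt from memory */
	const char *content;    /* the rebuilt content, or NULL if none */
};

/*
 * Return the content to deltify against, or NULL to write the object
 * without a delta. Refusing the base on a mismatch keeps the pack
 * correct, but it does not prevent the mismatch from arising.
 */
static const char *toy_pick_delta_base(const struct toy_base *b)
{
	assert(b != NULL);
	if (!b->content || b->recorded_hash != b->content_hash)
		return NULL;
	return b->content;
}
```

A guard like this only trades a wrong delta for a missing one; keeping the
recorded hash and the in-memory children in sync in the first place is the
todo described above.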
[1] http://thread.gmane.org/gmane.comp.version-control.git/176753
[2] http://thread.gmane.org/gmane.comp.version-control.git/176753/focus=177901
Dmitry Ivankov (7):
fast-import: extract object preparation function
fast-import: be saner with temporary trees
fast-import: fix a data corruption in parse_ls
fast-import: fix data corruption in store_tree
fast-import: extract tree_content reading function
fast-import: workaround data corruption
fast-import: fix data corruption in load_tree
fast-import.c | 169 +++++++++++++++++++++++++++++++++++++++++++++------------
1 files changed, 135 insertions(+), 34 deletions(-)
--
1.7.3.4
Thread overview: 13+ messages
2011-07-28 4:46 Dmitry Ivankov [this message]
2011-07-28 4:46 ` [PATCH/WIP 1/7] fast-import: extract object preparation function Dmitry Ivankov
2011-07-28 4:46 ` [PATCH/WIP 2/7] fast-import: be saner with temporary trees Dmitry Ivankov
2011-07-28 7:27 ` Jonathan Nieder
2011-07-28 4:46 ` [PATCH/WIP 3/7] fast-import: fix a data corruption in parse_ls Dmitry Ivankov
2011-07-28 7:34 ` Jonathan Nieder
2011-07-28 4:46 ` [PATCH/WIP 4/7] fast-import: fix data corruption in store_tree Dmitry Ivankov
2011-07-28 7:42 ` Jonathan Nieder
2011-07-28 8:11 ` Dmitry Ivankov
2011-07-28 4:46 ` [PATCH/WIP 5/7] fast-import: extract tree_content reading function Dmitry Ivankov
2011-07-28 4:46 ` [PATCH/WIP 6/7] fast-import: workaround data corruption Dmitry Ivankov
2011-07-28 6:31 ` Jonathan Nieder
2011-07-28 4:46 ` [PATCH/WIP 7/7] fast-import: fix data corruption in load_tree Dmitry Ivankov