From: Michael Haggerty <mhagger@alum.mit.edu>
To: git@vger.kernel.org
Subject: Re: import determinism
Date: Thu, 11 Nov 2010 05:28:00 +0100
Message-ID: <4CDB70D0.6000405@alum.mit.edu>
In-Reply-To: <20101109134337.GA19430@nibiru.local>
On 11/07/2010 09:25 PM, Enrico Weigelt wrote:
> I'm curious on how deterministic the imports (git-cvsimport and
> git-svn) are. Suppose I clone the same cvs repo twice (assuming
> no write access in between), are the resulting object SHA-1's
> the same ?
On 11/09/2010 02:43 PM, Enrico Weigelt wrote:
> The point behind this is: I'm running a growing number of cvs2git
> mirrors and don't want to do full backups of them.
If you are using cvs2git, why are you asking about git-cvsimport and
git-svn?
No tool that imports from CVS or Subversion can make a blanket guarantee
about consistency across conversions because both CVS and SVN allow
retroactive changes to the project history. For example:
* Both CVS and SVN allow commit messages and other metadata of old
commits to be changed.
* CVS allows files to be added retroactively to tags and branches with
no timestamp indicating that the file was not part of the original tag.
* CVS allows old revisions to be "obsoleted" (i.e., expunged from history).
* In CVS it is common practice for people to muck about directly in the
repository, for example renaming *,v files.
So (in the general case) there is no way to guarantee that two
independent conversions will have consistent results for the overlapping
parts of their history. And even incremental conversions will
necessarily have to decide between converting the current state of the
repository accurately and converting in a way that is consistent with
earlier conversions.
In practice, especially if you are willing to constrain what the CVS
users are allowed to do, the overlapping parts of two conversions should
usually be identical or at least very similar (with older history more
likely to be identical). Perhaps an rsync-style backup would be smart
enough to copy only the changed part of the history without excluding
the possibility that there are retroactive changes between subsequent
conversions.
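To illustrate why older history is more likely to come out identical, here is a minimal sketch in Python. It is a simplified model, not Git's actual object format and not anything taken from cvs2git: each commit ID is a SHA-1 over the commit's own data plus its parent's ID, so a retroactive change alters the ID of the changed commit and of everything after it, while earlier commits keep their IDs. The names chain_ids and shared_prefix_length and the sample commits are made up for the example.

```python
import hashlib

def chain_ids(commits):
    """Return one ID per commit; each ID covers the commit's data and its parent's ID."""
    ids = []
    parent = ""
    for author, message in commits:
        payload = f"parent {parent}\nauthor {author}\n\n{message}\n".encode("utf-8")
        parent = hashlib.sha1(payload).hexdigest()
        ids.append(parent)
    return ids

def shared_prefix_length(a, b):
    """Count how many leading IDs two conversions have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

first = [("alice", "initial import"), ("bob", "fix typo"), ("carol", "add feature")]
# Same history, but the second commit's log message was edited retroactively.
second = [("alice", "initial import"), ("bob", "fix typo in README"), ("carol", "add feature")]

ids_first = chain_ids(first)
ids_second = chain_ids(second)
print(shared_prefix_length(ids_first, ids_second))  # 1
```

Only the first commit is shared here: the edited message changes the second ID, and the third ID changes too because it includes its parent's ID. A backup that stores objects by ID would therefore only need to copy the part of the history from the first changed commit onward.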
If you run two cvs2git conversions on *exactly* the same CVS repository,
then the results *should* be identical. I have always tried to process
data in a defined order rather than, say, in filesystem or
hashmap-determined order. But AFAIK this property has not been tested
and could easily be buggy if I overlooked some source of indeterminism
somewhere in the cvs2git code.
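As a rough illustration of what "defined order" means here (this is a sketch, not the actual cvs2git code; the function name list_rcs_files is hypothetical): os.walk returns directory entries in whatever order the filesystem hands them out, so a traversal that should be repeatable has to sort them explicitly.

```python
import os
import tempfile

def list_rcs_files(root):
    """Return the relative paths of all *,v files under root in sorted order."""
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place fixes the order in which os.walk descends.
        dirnames.sort()
        for name in sorted(filenames):
            if name.endswith(",v"):
                full = os.path.join(dirpath, name)
                result.append(os.path.relpath(full, root))
    return result

# Build a small throwaway tree; files are created in a deliberately jumbled order.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("b.c,v", "sub/z.c,v", "a.c,v", "notes.txt"):
        with open(os.path.join(root, rel), "w") as f:
            f.write("")
    listing = list_rcs_files(root)

print(listing)
```

The listing comes out the same on every run and every filesystem, regardless of the order in which the files were created. The same idea applies to in-memory containers: iterate over sorted(keys) rather than relying on the container's internal order.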
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
Thread overview: 13+ messages
2010-11-07 20:25 import determinism Enrico Weigelt
2010-11-07 20:46 ` Ævar Arnfjörð Bjarmason
2010-11-07 21:01 ` Andreas Schwab
2010-11-07 21:56 ` Enrico Weigelt
2010-11-07 22:20 ` Martin Langhoff
2010-11-07 22:45 ` Andreas Schwab
2010-11-09 13:43 ` Enrico Weigelt
2010-11-10 4:40 ` Martin Langhoff
2010-11-10 16:18 ` Enrico Weigelt
2010-11-10 21:25 ` Martin Langhoff
2010-11-10 22:04 ` Enrico Weigelt
2010-11-11 4:28 ` Michael Haggerty [this message]
2010-11-11 13:09 ` Enrico Weigelt