git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Dirk Süsserott" <newsletter@dirk.my1.cc>
To: Tomas Carnecky <tom@dbservice.com>
Cc: nn6eumtr <nn6eumtr@gmail.com>, git@vger.kernel.org
Subject: Re: How to deal with historic tar-balls
Date: Sun, 01 Jan 2012 20:04:29 +0100	[thread overview]
Message-ID: <4F00AE3D.9050102@dirk.my1.cc> (raw)
In-Reply-To: <4EFFA868.50605@dbservice.com>

Am 01.01.2012 01:27 schrieb Tomas Carnecky:
> On 12/31/11 8:04 PM, nn6eumtr wrote:
>> I have a number of older projects that I want to bring into a git
>> repository. They predate a lot of the popular scm systems, so they are
>> primarily a collection of tarballs today.
>>
>> I'm fairly new to git so I have a couple questions related to this:
>>
>> - What is the best approach for bringing them in? Do I just create a
>> repository, then unpack the files, commit them, clean out the
>> directory unpack the next tarball, and repeat until everything is loaded?
>>
>> - Do I need to pay special attention to files that are renamed/removed
>> from version to version?
>>
>> - If the timestamps change on a file but the actual content does not,
>> will git treat it as a non-change once it realizes the content hasn't
>> changed?
>>
>> - Last, if after loading the repository I find another version of the
>> files that predates those I've loaded, or are intermediate between two
>> commits I've already loaded, is there a way to go say that commit B is
>> actually the ancestor of commit C? (i.e. a->c becomes a->b->c if you
>> were to visualize the commit timeline or do diffs) Or do I just reload
>> the tarballs in order to achieve this?
> 
> There is a script which will import sources from multiple tarballs,
> creating a commit with the contents of each tarball. It's in the git
> repository under contrib/fast-import/import-tars.perl.
> 
> tom

@tom: True. I didn't know about that script, but it should work.

@nn6eumtr: Basically your workflow is perfect. But let me give you some
explanation:

git init
foreach archive in *.tar; do
    tar xf $archive
    git add --all .
    git commit -m "Added $archive"
    # now remove everything except for the .git directory
    # with regular shell commands (rm -rf *). Also remove
    # any dot-files (and the tarball itself, if it's in the
    # current directory).
done

Notice the '--all' switch to 'git add': Normally, 'git add .' adds all
files that match the given pattern '.', i.e. all files in the current
directory (and below, it's recursive). The '--all' switch together with
the pattern '.' adds or updates all files already known to git *AND*
adds the files not yet known *AND* removes the files that are no longer
in the working tree. That's exactly what you want.

Consider archive1.tar with files A, B, C:

  git add --all . # will add A, B, and C

Now remove A, B, C, and unpack archive2.tar. Assume it has files B, C,
D. A was deleted, B was changed, C is unchanged, D is new.

  git add --all . # will remove A, add B, leave C, add D.

git will notice that C hasn't changed its content (timestamp doesn't
matter).

Without the '--all' switch, git would simply add B and D.

There is no problem re-arranging the history after your import (see "git
rebase --help", especially the --interactive section), but then you
probably will have conflicts and have to resolve them. I'd suggest to
re-start the import instead.

Please note that "for archive in *.tar" will pick the tarballs in
lexicographical order. That might not be your intention.

HTH,
    Dirk

  parent reply	other threads:[~2012-01-01 19:06 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-31 19:04 How to deal with historic tar-balls nn6eumtr
2012-01-01  0:27 ` Tomas Carnecky
2012-01-01 18:30   ` Philip Oakley
2012-01-01 20:54     ` Philip Oakley
2012-01-02 10:07     ` Philip Oakley
2012-01-02 18:26       ` Dirk Süsserott
2012-01-04 20:04         ` Philip Oakley
2012-01-01 19:04   ` Dirk Süsserott [this message]
2012-01-05 15:25 ` Neal Kreitzinger
2012-01-07  1:10   ` nn6eumtr
2012-01-07  1:50     ` Thomas Rast
2012-01-07 19:18     ` Neal Kreitzinger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F00AE3D.9050102@dirk.my1.cc \
    --to=newsletter@dirk.my1.cc \
    --cc=git@vger.kernel.org \
    --cc=nn6eumtr@gmail.com \
    --cc=tom@dbservice.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).