From: "Dirk Süsserott" <newsletter@dirk.my1.cc>
To: Tomas Carnecky <tom@dbservice.com>
Cc: nn6eumtr <nn6eumtr@gmail.com>, git@vger.kernel.org
Subject: Re: How to deal with historic tar-balls
Date: Sun, 01 Jan 2012 20:04:29 +0100
Message-ID: <4F00AE3D.9050102@dirk.my1.cc>
In-Reply-To: <4EFFA868.50605@dbservice.com>

On 01.01.2012 01:27, Tomas Carnecky wrote:
> On 12/31/11 8:04 PM, nn6eumtr wrote:
>> I have a number of older projects that I want to bring into a git
>> repository. They predate a lot of the popular scm systems, so they are
>> primarily a collection of tarballs today.
>>
>> I'm fairly new to git so I have a couple questions related to this:
>>
>> - What is the best approach for bringing them in? Do I just create a
>> repository, then unpack the files, commit them, clean out the
>> directory unpack the next tarball, and repeat until everything is loaded?
>>
>> - Do I need to pay special attention to files that are renamed/removed
>> from version to version?
>>
>> - If the timestamps change on a file but the actual content does not,
>> will git treat it as a non-change once it realizes the content hasn't
>> changed?
>>
>> - Last, if after loading the repository I find another version of the
>> files that predates those I've loaded, or are intermediate between two
>> commits I've already loaded, is there a way to go say that commit B is
>> actually the ancestor of commit C? (i.e. a->c becomes a->b->c if you
>> were to visualize the commit timeline or do diffs) Or do I just reload
>> the tarballs in order to achieve this?
> 
> There is a script which will import sources from multiple tarballs,
> creating a commit with the contents of each tarball. It's in the git
> repository under contrib/fast-import/import-tars.perl.
> 
> tom

@tom: True. I didn't know about that script, but it should work.

@nn6eumtr: Basically your workflow is perfect. But let me give you some
explanation:

git init
for archive in *.tar; do
    tar xf "$archive"
    git add --all .
    git commit -m "Added $archive"
    # now remove everything except for the .git directory
    # with regular shell commands (rm -rf *). Also remove
    # any dot-files (and the tarball itself, if it's in the
    # current directory).
done

Notice the '--all' switch to 'git add': Normally, 'git add .' adds all
files that match the given pattern '.', i.e. all files in the current
directory (and below, it's recursive). The '--all' switch together with
the pattern '.' adds or updates all files already known to git *AND*
adds the files not yet known *AND* removes the files that are no longer
in the working tree. That's exactly what you want.
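
The loop above leaves the clean-up step as a comment. Here is a minimal sketch with that step filled in. It is one possible arrangement, not the only one, and rests on a few assumptions: the tarballs are kept in a separate directory (the hypothetical $work/tarballs) so that cleaning the working tree cannot delete them, the clean-up runs before each unpack rather than after the commit, and find supports -mindepth/-maxdepth (GNU and BSD find do). The two sample tarballs and the committer identity exist only to make the sketch runnable on its own.

```shell
set -eu

# Hypothetical layout for this demo: tarballs in $work/tarballs,
# the new repository in $work/repo.
work=$(mktemp -d)
mkdir "$work/tarballs" "$work/src" "$work/repo"

# Build two tiny sample tarballs so the sketch can run on its own.
cd "$work/src"
echo one > A; echo two > B; echo three > C
tar cf "$work/tarballs/archive1.tar" A B C
rm A; echo changed > B; echo four > D
tar cf "$work/tarballs/archive2.tar" B C D

cd "$work/repo"
git init -q
# Repository-local identity so the commits succeed without global config.
git config user.name "Import Demo"
git config user.email "import@example.invalid"

for archive in "$work"/tarballs/*.tar; do
    # Remove everything except .git, dot-files included.
    find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
    tar xf "$archive"
    git add --all .
    git commit -q -m "Added $(basename "$archive")"
done

# Show what each import commit added, modified, or deleted.
git log --oneline --name-status
```

Cleaning before the unpack instead of after the commit leaves the last imported version checked out when the loop ends.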

Consider archive1.tar with files A, B, C:

  git add --all . # will add A, B, and C

Now remove A, B, C, and unpack archive2.tar. Assume it has files B, C,
D. A was deleted, B was changed, C is unchanged, D is new.

  git add --all . # will remove A, add B, leave C, add D.

git will notice that C hasn't changed its content (timestamp doesn't
matter).

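Without the '--all' switch, git would simply add B and D and would not
record the removal of A. (That holds for Git versions before 2.0; since
2.0, 'git add .' also stages removals.)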
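
To see this for yourself, here is a small sketch that recreates the A/B/C/D situation in a throw-away repository. The file contents and the committer identity are made up for the demo, and plain files stand in for the unpacked tarballs.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q
# Repository-local identity so the commit succeeds without global config.
git config user.name "Demo"
git config user.email "demo@example.invalid"

# State after unpacking archive1.tar: files A, B, C.
echo a > A; echo b > B; echo c > C
git add --all .
git commit -q -m "archive1"

# State after unpacking archive2.tar: A is gone, B has new content,
# C has a fresh timestamp but identical content, D is new.
rm A
echo b2 > B
touch C
echo d > D

git add --all .

# Lists A as deleted (D), B as modified (M) and D as added (A).
# C does not appear: only its timestamp changed, not its content.
git diff --cached --name-status --no-renames
```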

There is no problem re-arranging the history after your import (see "git
rebase --help", especially the --interactive section), but then you
will probably have conflicts that you have to resolve. I'd suggest
restarting the import instead.

Please note that "for archive in *.tar" will pick the tarballs in
lexicographical order. That might not be your intention.
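
A short sketch of the difference, using made-up file names (proj-1.2.tar and so on). It assumes a sort that understands -V (version sort), which GNU coreutils provides; other systems may need a different way to order the files.

```shell
set -eu

dir=$(mktemp -d)
cd "$dir"
# Empty placeholder files; only the names matter here.
touch proj-1.2.tar proj-1.10.tar proj-1.9.tar

# Glob order is lexicographic: 1.10 comes before 1.2 and 1.9.
glob_order=$(printf '%s\n' *.tar)

# Version order: 1.2, 1.9, 1.10.
version_order=$(printf '%s\n' *.tar | sort -V)

printf 'glob:\n%s\n\nsort -V:\n%s\n' "$glob_order" "$version_order"
```

To import in version order, the sorted list can be fed to the loop, for example with "printf '%s\n' *.tar | sort -V | while IFS= read -r archive; do ...; done". If the version numbers cannot be sorted mechanically, listing the tarballs by hand in the right order is the safer choice.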

HTH,
    Dirk
