From: "Juan Jose Comellas" <juanjo@comellas.org>
To: "Brian Downing" <bdowning@lavos.net>
Cc: git@vger.kernel.org, monotone-devel@nongnu.org
Subject: Re: [RFC] mtn to git conversion script
Date: Tue, 11 Nov 2008 14:40:46 -0200
Message-ID: <1c3be50f0811110840p7dff8972r2a5ef7193a0306c2@mail.gmail.com>
In-Reply-To: <20080825163530.GJ31114@lavos.net>

I modified the script that converts Monotone repositories to Git so
that it works with my repositories. I also added support for renaming
commit authors.

To use the modified script, pass the name of the Monotone repository
file as the first argument. An optional second argument names a file
that maps author email addresses to full names. This file should have
one line per commit author, in the following format:

Firstname Lastname <email@example.com>
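
As a rough sketch of how such a file becomes a lookup table keyed by
email address (this is not the code from the attachment; the helper name
parse_author_line and the sample entries are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: split one "Firstname Lastname <email>" line into
# (email, name). Returns an empty list for lines that do not match.
sub parse_author_line {
    my ($line) = @_;
    return unless $line =~ /^\s*(.*?\S)\s*<([^<>]+)>\s*$/;
    return ($2, $1);
}

# Sample entries, invented for this example.
my @lines = (
    "Jane Doe <jane\@example.com>\n",
    "John Q. Public <jqp\@example.org>\n",
    "# not an author line\n",
);

# Build the email => name table, skipping lines that do not parse.
my %name_for;
for my $line (@lines) {
    my ($email, $name) = parse_author_line($line) or next;
    $name_for{$email} = $name;
}

printf "%s => %s\n", $_, $name_for{$_} for sort keys %name_for;
```

Authors whose address is not listed in the file are not renamed; the
attached script falls back to the part of the address before the "@".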

This script still uses the AutomateStdio.pm Perl module that can be
found in the net.venge.monotone.contrib.lib.automate-stdio branch of
Monotone's main repository.

I'm no Perl guru so there might be some bugs lurking in the code I
added. It did work for my repositories, though.

PS. Resending because I mistakenly sent the previous message as HTML mail.


On Mon, Aug 25, 2008 at 2:35 PM, Brian Downing <bdowning@lavos.net> wrote:
>
> On Sun, Aug 24, 2008 at 12:18:50PM +0300, Felipe Contreras wrote:
> > I developed a script that converts a monotone repository into a git
> > one (exact clone), I want to contribute it so everybody can use it.
> >
> > This is the gist of the script:
> >
> > mtn update --revision #{@id} --reallyquiet
> > git ls-files --modified --others --exclude-standard -z | git
> > update-index --add --remove -z --stdin
> > git write-tree
> > git write-raw < /tmp/commit.txt
> > git update-ref refs/mtn/#{@id} #{@git_id}
> >
> > branches.each do |e|
> >     git update-ref refs/heads/#{e} #{@git_id}
> > end
>
> You definitely want to use fast-import, but you probably want to do
> something a lot closer to fast-export for monotone (read: use its
> automate stdio interface and avoid expensive calls).
>
> Here's a simple monotone to git converter I wrote.  You'll need the
> Monotone::AutomateStdio perl module to use it (which I think I got
> from monotone's net.venge.monotone.contrib.lib.automate-stdio branch).
> It is very fast; it can convert the OpenEmbedded repo in something like
> 5-10 minutes on my machine.
>
> Note that for monotone export to go fast you absolutely /must/ avoid the
> get_manifest operation.  In my converter I use the revision information
> directly.  Getting the renames right with this is a little tricky; IIRC,
> the ordering that works is:
>
> * Rename all renamed files, innermost files first, to temporary names.
> * Delete all deleted files, innermost first.
> * Rename all temporary names to permanent names, outermost first.
> * Add all new/modified files.
>
> Conveniently, all of the above can be done by directly emitting
> fast-import commands, so you don't have to keep track of trees directly.
> (With one exception, which I'll elaborate on in a different email.)
>
> -bcd
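
The ordering Brian describes above can be sketched in isolation roughly
as follows. This is only an illustration under a few assumptions: the
function name file_commands and the sample paths are invented, path
length is used as a stand-in for "innermost" (as the attached script
does), and paths are assumed not to need quoting.

```perl
use strict;
use warnings;

# Sketch: turn one revision's changes into fast-import file commands.
# Arguments (all illustrative): {from => to}, [deleted paths], {path => mark}.
sub file_commands {
    my ($renames, $deletes, $adds) = @_;
    my @cmds;
    my %tmp_for;    # original path => temporary name
    my $n = 0;

    # 1. Move every renamed path to a temporary name, innermost first
    #    (longer paths sort before the directories that contain them).
    for my $from (sort { length($b) <=> length($a) or $a cmp $b }
                  keys %$renames) {
        $tmp_for{$from} = "__tmp_" . ++$n;
        push @cmds, "R $from $tmp_for{$from}";
    }

    # 2. Delete removed paths, innermost first.
    push @cmds, "D $_"
        for sort { length($b) <=> length($a) or $a cmp $b } @$deletes;

    # 3. Move temporaries to their final names, outermost first.
    for my $from (sort { length($renames->{$a}) <=> length($renames->{$b})
                         or $a cmp $b } keys %$renames) {
        push @cmds, "R $tmp_for{$from} $renames->{$from}";
    }

    # 4. Add new and modified files, referring to blobs by mark.
    push @cmds, "M 0644 :$adds->{$_} $_" for sort keys %$adds;

    return @cmds;
}

# A directory rename combined with a file rename inside it.
print "$_\n" for file_commands(
    { 'src' => 'lib', 'src/a.c' => 'lib/main.c' },
    [ 'old/junk.txt' ],
    { 'lib/b.c' => 7 },
);
```

Going through temporary names means a rename never lands on a path that
another rename in the same revision still has to vacate.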

[-- Attachment #2: mtn-to-git.pl --]
[-- Type: text/x-perl; name=mtn-to-git.pl, Size: 5793 bytes --]

#!/usr/bin/perl
# Copyright (C) 2007-2008  Brian Downing
# This program is licensed under version 2 of the GNU GPL.

use strict;
use Monotone::AutomateStdio;
use Date::Parse;

my $D = 0;

my $m = Monotone::AutomateStdio->new($ARGV[0]);

my $revlist = [];
$m->graph($revlist);
my $sorted = [];

for my $rev (@$revlist) {
    push(@$sorted, $rev->{revision_id});
}

my $leaves = [];
$m->leaves($leaves);

$m->toposort($sorted, @$sorted);

my $marks = {};
my $mark = 1;
my $c = 0;
my %authors = ();

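sub quote_file {
    # Work on a localized copy so the caller's $_ is not clobbered.
    local $_ = shift;
    # NOTE: this early return disables the quoting below, so paths are
    # passed to fast-import verbatim.
    return $_;
    s/\\/\\\\/g;
    s/\n/\\n/g;
    s/"/\\"/g;
    return qq("$_");
}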

sub lprint {
    my $fh = shift;
    print @_ if $D;
    print $fh @_;
}

sub lprintf {
    my $fh = shift;
    printf @_ if $D;
    printf $fh @_;
}

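# Read the authors map file and return a hash of email => full name.
sub load_authors {
    my %authors = ();
    my $filename = shift(@_);
    open my $fi, '<', $filename
        or die "Could not open authors map file $filename: $!\n";
    # Each line has the form "Firstname Lastname <email>".
    while (my $line = <$fi>) {
        if ($line =~ m/(.*) <(.*)>/) {
            if ($2) {
                $authors{$2} = $1;
            }
        }
    }
    close $fi;
    return %authors;
}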

sub author_name {
    my $email = shift(@_);
    my $name = $authors{$email};
    if ($name) {
        return $name;
    } else {
        if ($email =~ m/(.+)\@.+/) {
            return $1;
        } else {
            return $email;
        }
    }
}

my $tmptag = "624d893e-ae1a-42d8-90a9-926a6ceffae8";

if ($ARGV[1]) {
    %authors = load_authors($ARGV[1]);
} else {
    %authors = ();
}
open my $fi, '|-', 'git fast-import --export-marks=file'
    or die "Could not start git fast-import: $!\n";
# open my $fi, '>fast-import.dump';
binmode $fi;
for my $rev (@$sorted) {
    my ($time, $author, $msg) = ("0", "__UNKNOWN__", "__UNKNOWN__");
    my @certs;
    my @branches;
    $m->certs(\@certs, $rev);
    for my $cert (@certs) {
        my ($n, $v) = ($cert->{name}, $cert->{value});
        $author = $v if ($n eq 'author');
        $time = $v if ($n eq 'date');
        $msg = $v if ($n eq 'changelog');
        push(@branches, $v) if ($n eq 'branch');
    }
    my $email = $author;
    $msg .= "\nmtn-revision: $rev\n";
    for my $b (sort @branches) {
        $msg .= "mtn-branch: $b\n";
    }
    $time = str2time($time, 'UTC');
    my $mfest = [];
    $m->get_revision($mfest, $rev);
    my $orcount = 0;
    my $add_files = {};
    my $add_dirs = {};
    my $delete_files = {};
    my $from_tmpnames = {};
    my $to_tmpnames = {};
    my $curtmp = 0;
    my @parents;
    for my $e (@$mfest) {
        if ($e->{type} eq 'old_revision') {
            push(@parents, $e->{revision_id});
            ++$orcount;
        } 
        next if $orcount > 1;
        if ($e->{type} eq 'add_file' || $e->{type} eq 'patch') {
            my $id = $e->{file_id} || $e->{to_file_id};
            $add_files->{$e->{name}} = $id;
            unless ($marks->{$id}) {
                my $data;
                $m->get_file(\$data, $id);
                print "new file $id\n" if $D;
                print $fi "blob\n";
                my $len = length($data);
                print $fi "mark :$mark\n";
                $marks->{$id} = $mark++;
                print $fi "data $len\n$data\n";
                #print $fi "data $len\n";
                #print $fi pack('C', $data);
                #print $fi "\n";
            }
        } elsif ($e->{type} eq 'add_dir') {
            $add_dirs->{$e->{name}} = 1;
        } elsif ($e->{type} eq 'delete') {
            $delete_files->{$e->{name}} = 1;
        } elsif ($e->{type} eq 'rename') {
            $curtmp++;
            $from_tmpnames->{$e->{from_name}} = "__tmp_${tmptag}_$curtmp";
            $to_tmpnames->{$e->{to_name}} = "__tmp_${tmptag}_$curtmp";
        }
    }
    printf("rev $rev (%d/%d, %.2f%%)\n",
           ++$c, scalar(@$sorted), 100*$c/scalar(@$sorted));
    print $fi "reset refs/import\n" unless @parents;
    lprint $fi, "commit refs/import\n";
    print $fi "mark :$mark\n";
    $marks->{$rev} = $mark++;
    if ($author =~ m(\s*(.*?\S)\s*<(.*)>\s*)) {
        # $author = $1;
        $email = $2;
    }
    # $author =~ s/[<>]/_/g;
    $email =~ s/[<>]/_/g;
    # $author =~ s/@.*//;
    $author = author_name($email);
    print $fi "committer $author <$email> $time +0000\n";
    my $len = length($msg);
    print $fi "data $len\n$msg\n";
    my $from = "from";
    for my $p (@parents) {
        # lprint $fi, "$from :$marks->{$p}\n";
        my $parent_mark = $marks->{$p};
        if ($parent_mark) {
            lprint $fi, "$from :$parent_mark\n";
            $from = "merge";
        }
    }
    for my $f (sort { length($b) <=> length ($a) } keys %$from_tmpnames) {
        lprintf($fi, "R %s %s\n",
                quote_file($f), quote_file($from_tmpnames->{$f}));
    }
    for my $f (sort { length($b) <=> length ($a) } keys %$delete_files) {
        lprintf($fi, "D %s\n", quote_file($f));
    }
    for my $f (sort { length($a) <=> length ($b) } keys %$to_tmpnames) {
        lprintf($fi, "R %s %s\n",
                quote_file($to_tmpnames->{$f}), quote_file($f));
    }
    for my $f (keys %$add_files) {
        lprintf($fi, "M 0644 :%s %s\n",
                $marks->{$add_files->{$f}}, quote_file($f));
    }
    for my $f (keys %$add_dirs) {
        $f .= "/" if $f;
        lprintf($fi, "M 0644 inline %s\n", quote_file("$f.gitignore"));
        lprint($fi, "data 0\n\n");
    }
    print $fi "\n";
}

my $branches = {};
for my $rev (@$leaves) {
    my $branch;
    my @certs;
    $m->certs(\@certs, $rev);
    for my $cert (@certs) {
        my ($n, $v) = ($cert->{name}, $cert->{value});
        $branch = $v if ($n eq 'branch');
    }
    my $r = $branches->{$branch};
    $branches->{$branch}--;
    if ($marks->{$rev}) {
        print $fi "reset refs/heads/$branch$r\n";
        print $fi "from :$marks->{$rev}\n\n";
    }
}

close $fi;
