From: "James Sadler" <freshtonic@gmail.com>
To: "Jeff King" <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: git filter-branch --subdirectory-filter
Date: Sat, 10 May 2008 17:10:21 +1000
Message-ID: <e5e204700805100010k4f1bee78y7d387d660cca3f35@mail.gmail.com>
In-Reply-To: <20080510055332.GB11556@sigill.intra.peff.net>

Excellent!  I'll give that a whirl, thanks.

- James.

2008/5/10 Jeff King <peff@peff.net>:
> On Sat, May 10, 2008 at 01:31:37PM +1000, James Sadler wrote:
>
>> Does anybody have a script that can take an existing repo and create
>> a new one with garbled-but-equivalent commits?  i.e. the file and
>> directory structure is the same with names changed, and there is a
>> one-to-one relationship between lines of text in the new repo and the
>> old one, except that the lines have been scrambled?  It would be a
>> useful tool for distributing private repositories for debugging.
>
> This is only lightly tested, but the script below should do the trick.
> It works as an index filter which munges all content in such a way that
> a particular line is always given the same replacement text. That means
> that diffs will look approximately the same, but will add and remove
> lines that say "Fake line XXX" instead of the actual content.
>
> You can munge the commit messages themselves by just replacing them with
> some unique text; in the example below, we just replace each one with
> the md5sum of the original message.
>
> This will leave the original author, committer, and date, which are
> presumably non-proprietary.
>
> -- >8 --
> #!/usr/bin/perl
> #
> # Obscure a repository while still maintaining the same history
> # structure and diffs.
> #
> # Invoke as:
> #   git filter-branch \
> #     --msg-filter md5sum \
> #     --index-filter /path/to/this/script
>
> use strict;
> use IPC::Open2;
> use DB_File;
> use Fcntl;
> tie my %blob_cache, 'DB_File', 'blob-cache', O_RDWR|O_CREAT, 0666;
> tie my %line_cache, 'DB_File', 'line-cache', O_RDWR|O_CREAT, 0666;
>
> open(my $lsfiles, '-|', qw(git ls-files --stage))
>  or die "unable to open ls-files: $!";
> open(my $update, '|-', qw(git update-index --index-info))
>  or die "unable to open update-index: $!";
>
> while(<$lsfiles>) {
>  my ($mode, $hash, $path) = /^(\d+) ([0-9a-f]{40}) \d\t(.*)/
>    or die "bad ls-files line: $_";
>  $blob_cache{$hash} = munge($hash)
>    unless exists $blob_cache{$hash};
>  print $update "$mode $blob_cache{$hash}\t$path\n";
> }
>
> close($lsfiles);
> close($update);
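> # $? holds the raw wait status of update-index; reduce it to 0 or 1,
> # since passing it to exit directly can truncate a failure to 0.
> exit($? ? 1 : 0);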
>
> sub munge {
>  my $h = shift;
>
>  open(my $in, '-|', qw(git show), $h)
>    or die "unable to open git show: $!";
>  open2(my $hash, my $out, qw(git hash-object -w --stdin));
>
>  while(<$in>) {
>    $line_cache{$_} ||= 'Fake line ' . $line_cache{CURRENT}++ . "\n";
>    print $out $line_cache{$_};
>  }
>
>  close($in);
>  close($out);
>
>  my $r = <$hash>;
>  chomp $r;
>  return $r;
> }
>



-- 
James
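
For readers who want to see the line-replacement idea from the quoted script in isolation, here is a minimal sketch. It is written in Python rather than the script's Perl, it keeps the mapping in memory instead of in DB_File caches, and the names LineObscurer, munge_line, and munge_blob are hypothetical. It only illustrates the property described above, that a given line always receives the same replacement text; it does not touch git at all.

```python
from typing import Dict


class LineObscurer:
    """Map each distinct input line to a stable 'Fake line N' placeholder.

    Assumption: lines are compared including their trailing newline, as in
    the quoted Perl script, so "foo" and "foo\n" count as different lines.
    """

    def __init__(self) -> None:
        # Original line -> replacement line; plays the role of %line_cache.
        self._line_cache: Dict[str, str] = {}

    def munge_line(self, line: str) -> str:
        # Assign the next free number the first time a line is seen,
        # then reuse that replacement on every later occurrence.
        if line not in self._line_cache:
            self._line_cache[line] = f"Fake line {len(self._line_cache)}\n"
        return self._line_cache[line]

    def munge_blob(self, content: str) -> str:
        # Replace a whole file's content line by line.
        return "".join(
            self.munge_line(line) for line in content.splitlines(keepends=True)
        )


if __name__ == "__main__":
    obscurer = LineObscurer()
    # Two versions of the same file: the second inserts one line.
    print(obscurer.munge_blob("alpha\nbeta\n"), end="")
    print("--")
    print(obscurer.munge_blob("alpha\ngamma\nbeta\n"), end="")
```

Because unchanged lines keep their placeholders across versions, a diff between the two obscured blobs adds exactly one line, just as a diff between the originals would.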

Thread overview: 10+ messages
2008-05-09  1:01 git filter-branch --subdirectory-filter James Sadler
2008-05-09  1:33 ` Jeff King
2008-05-09  7:38   ` James Sadler
2008-05-09  7:57     ` Johannes Sixt
2008-05-09  8:00     ` Jeff King
2008-05-10  3:31       ` James Sadler
2008-05-10  5:53         ` Jeff King
2008-05-10  7:10           ` James Sadler [this message]
2008-05-10 11:38           ` James Sadler
2008-05-10 11:44             ` Jeff King
